Overwatch continuously ingests AIS vessel positions at full resolution — often multiple reports per vessel per second. To keep the live database performant without losing historical data, Overwatch applies a three-stage lifecycle: hot retention in the live database, tiered downsampling to reduce density over time, and long-term archival to cold storage.
How it works
A nightly maintenance job runs in two phases:
- Archive — data older than the retention window is compressed and written to cold storage, then removed from the live database.
- Downsample — data that remains in the live database is thinned to progressively lower resolution as it ages.
Full-resolution data is always preserved in the archive. If you need to query historical data at its original resolution, contact support to have it rehydrated from the archive.
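The downsample phase can be pictured as bucketing each vessel's reports into fixed time windows and keeping one report per window. The sketch below is illustrative only: the tuple layout, the function name `downsample`, and the choice to keep the first report in each window are assumptions, not a description of the actual maintenance job.

```python
from datetime import datetime, timedelta, timezone


def downsample(positions, window_seconds):
    """Keep one report per vessel per fixed time window.

    `positions` is an iterable of (vessel_id, timestamp, payload) tuples
    with timezone-aware timestamps. Keeping the earliest report in each
    window is an assumption made for this sketch.
    """
    seen = set()
    kept = []
    # Sort by vessel, then time, so the earliest report in a window wins.
    for vessel_id, ts, payload in sorted(positions, key=lambda p: (p[0], p[1])):
        bucket = int(ts.timestamp()) // window_seconds
        if (vessel_id, bucket) not in seen:
            seen.add((vessel_id, bucket))
            kept.append((vessel_id, ts, payload))
    return kept
```

Calling `downsample(rows, 60)` would correspond to the 1-per-minute tier, and `downsample(rows, 300)` to the 1-per-5-minutes tier.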
Retention windows
Each data type has a retention window that controls how long it stays in the live database before being archived.
| Data type | Retention window | Notes |
|---|---|---|
| AIS positions | 90 days | Largest table by volume. Downsampled in place at 7 and 30 days, then archived raw to cold storage at 90 days. |
| Port events | 60 days | Intelligence is also captured in vessel visits. |
| Dark events | 90 days | AIS gap and dark activity detections. |
| STS events | 90 days | Ship-to-ship transfer encounters. |
| Loitering events | 90 days | Prolonged stationary vessel detections. |
| Ingestion logs | 14 days | Internal pipeline telemetry. |
Archived data is retained indefinitely at full resolution in cold storage. Only the live database copy is removed after the retention window.
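As a rough illustration of how a retention window translates into an archive cutoff, the sketch below subtracts the window from the current time. The table identifiers (`ais_positions`, `port_events`, and so on) and the function names are hypothetical; only the day counts come from the table above.

```python
from datetime import datetime, timedelta, timezone

# Day counts from the retention table; the keys are hypothetical table names.
RETENTION_DAYS = {
    "ais_positions": 90,
    "port_events": 60,
    "dark_events": 90,
    "sts_events": 90,
    "loitering_events": 90,
    "ingestion_logs": 14,
}


def archive_cutoff(table, now):
    """Rows with a timestamp earlier than this are past the retention window."""
    return now - timedelta(days=RETENTION_DAYS[table])


def is_due_for_archive(table, row_time, now):
    """True if a row is older than the retention window for its table."""
    return row_time < archive_cutoff(table, now)
```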
AIS position lifecycle
AIS positions move through four stages as they age. The first three stages keep data in the live database at progressively coarser resolution; the fourth moves raw data to cold storage.
| Stage | Age range | Where it lives | Resolution | What it means |
|---|---|---|---|---|
| Live | 0–7 days | Live database | Full resolution | Every report is kept as received (often multiple per second). |
| Tier 0 | 7–30 days | Live database | 1 per minute | One position per vessel per minute. Sufficient for route replay and behavioral analysis. |
| Tier 1 | 30–90 days | Live database | 1 per 5 minutes | One position per vessel per 5-minute window. Useful for historical track review. |
| Archive | Over 90 days | Cold storage (R2) | Full resolution | Raw rows are written to compressed JSONL and removed from the live database. |
Data older than 90 days is preserved at full resolution in the archive — not downsampled. This means retrospective analyses that need fine-grained position history (for example, ship-to-ship transfer reconstruction or motif detection) can be answered at original resolution by rehydrating from the archive.
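The stage table can be read as a simple lookup from a position's age to where it lives and at what resolution. The following sketch encodes that lookup; the function name `ais_stage` is hypothetical, and the handling of exact boundaries (a report that is exactly 7, 30, or 90 days old) is an assumption, since the age ranges in the table share their endpoints. In practice the transition also depends on when the nightly job last ran.

```python
from datetime import timedelta

# (upper age bound, stage, location, resolution), taken from the stage table.
STAGES = [
    (timedelta(days=7), "Live", "live database", "full"),
    (timedelta(days=30), "Tier 0", "live database", "1 per minute"),
    (timedelta(days=90), "Tier 1", "live database", "1 per 5 minutes"),
]
ARCHIVE = ("Archive", "cold storage", "full")


def ais_stage(age):
    """Return (stage, location, resolution) for a position of the given age.

    Assumption: each upper bound is exclusive, so a report that is exactly
    7 days old is treated as Tier 0.
    """
    for upper, name, location, resolution in STAGES:
        if age < upper:
            return (name, location, resolution)
    return ARCHIVE
```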
Resolution impact
At full resolution, a single vessel broadcasting every 2 seconds produces 43,200 position reports per day (86,400 seconds divided by 2), or roughly 43,000. After in-place downsampling:
| Stage | Reports per vessel per day |
|---|---|
| Live | ~43,000 |
| Tier 0 (1 min) | ~1,440 |
| Tier 1 (5 min) | ~288 |
| Archive (raw) | ~43,000 (in cold storage) |
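The figures in the table follow from dividing the number of seconds in a day by the reporting interval. A short worked calculation, assuming the 2-second broadcast interval used above (the function name is illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def reports_per_day(interval_seconds):
    """Reports per vessel per day at one report every `interval_seconds`."""
    return SECONDS_PER_DAY // interval_seconds


full = reports_per_day(2)      # 43,200 (shown as ~43,000 in the table)
tier0 = reports_per_day(60)    # 1,440
tier1 = reports_per_day(300)   # 288

# Reduction factors relative to full resolution.
tier0_reduction = full / tier0  # 30.0
tier1_reduction = full / tier1  # 150.0
```

For a vessel reporting every 2 seconds, Tier 0 therefore keeps about 1 in 30 reports and Tier 1 about 1 in 150.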
What this means for your queries
- Real-time tracking and alerting (0–7 days) — full resolution, no data loss.
- Recent investigations (7–30 days) — 1-minute resolution is sufficient for route reconstruction, speed profiling, and anomaly detection.
- Historical analysis (30–90 days) — 5-minute resolution shows vessel tracks and port visits clearly but may miss brief maneuvers.
- Long-term and retrospective analysis (90+ days) — full-resolution data is available from the archive on request. Live-database queries do not return data older than 90 days; contact support to rehydrate the date range you need.
Archive storage
Archived data is stored in Cloudflare R2 as gzip-compressed JSONL, organized by table and date range. The archive retains full-resolution data indefinitely, so no information is permanently lost when records leave the live database.
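For readers unfamiliar with the format, gzip-compressed JSONL is one JSON object per line, with the whole file compressed using gzip. The sketch below shows a round trip using only the Python standard library. The field names (`mmsi`, `lat`, `lon`, `ts`) and the object key layout in `archive_key` are assumptions for illustration; the actual schema and R2 object naming are not specified here.

```python
import gzip
import json
from datetime import date


def encode_jsonl_gz(rows):
    """Serialize rows as one JSON object per line, then gzip the result."""
    text = "".join(json.dumps(row) + "\n" for row in rows)
    return gzip.compress(text.encode("utf-8"))


def decode_jsonl_gz(blob):
    """Inverse of encode_jsonl_gz: decompress and parse each non-empty line."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


def archive_key(table, start, end):
    """Hypothetical object key grouping an archive file by table and date range."""
    return f"{table}/{start.isoformat()}_{end.isoformat()}.jsonl.gz"
```

Because each line is an independent JSON object, a rehydration process could stream the decompressed file line by line rather than loading it whole.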
A daily cron runs at 03:00 UTC and archives any rows older than the retention window. AIS position archival is capped at 200,000 rows per run so that a backlog cannot starve the smaller archives (port events, transit metrics, POI snapshots) that share the same cron budget. At normal ingest volume the cap clears roughly two hours of historical raw positions per day, which is more than enough to keep up once the system reaches steady state.
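The effect of the per-run cap can be sketched as follows. Only the 200,000-row figure comes from the text above; the table names and both functions are hypothetical, and `runs_to_clear` deliberately ignores rows that newly age past the retention window while a backlog is draining.

```python
import math

AIS_ROW_CAP = 200_000  # per-run cap on AIS position archival


def plan_run(pending, cap=AIS_ROW_CAP, capped_table="ais_positions"):
    """Rows to archive per table in one run; only the AIS table is capped."""
    return {
        table: min(rows, cap) if table == capped_table else rows
        for table, rows in pending.items()
    }


def runs_to_clear(backlog_rows, cap=AIS_ROW_CAP):
    """Daily runs needed to drain a fixed AIS backlog (simplified estimate)."""
    return math.ceil(backlog_rows / cap)
```

Under these assumptions, a backlog of 1,000,000 expired AIS rows would take five daily runs to drain, while the smaller archives are processed in full on every run.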
If you need to query archived data at its original resolution — for example, to reconstruct a vessel’s exact track from several months ago — contact support to request a rehydration.
Adjusting retention
Retention windows are configured at the platform level and are not user-adjustable. If your use case requires longer hot retention (for example, keeping 90 days of full-resolution AIS positions for ongoing investigations), contact support to discuss options.