The envelope coverage dashboard shows how far along each catalog table is in adopting the APRS data envelope. Open it at Admin → Codex coverage (/admin/codex-coverage) to see, at a glance, which tables are fully compliant, which are in progress, and which haven’t started yet.
Why it matters
Every row in the Codex catalog needs a populated record_id to be traceable, deduplicated, and exportable in audit packages. The coverage dashboard gives you a single view into that progress so you can answer questions like:
- How close is my dataset to full APRS compliance?
- Which tables still have unpopulated records?
- Are nightly backfill jobs actually making progress?
What you’ll see
Summary cards
At the top of the dashboard, four summary cards show:
| Card | Description |
|---|
| Tables tracked | Total number of catalog tables with a record_id column. |
| Fully compliant | Tables where ≥ 99% of rows have a populated record_id. |
| Empty / 0% | Tables where no rows have been backfilled yet. |
| In progress | Everything in between — backfill has started but isn’t complete. |
An overall percentage is displayed alongside the total number of envelope-compliant rows out of all rows across every tracked table.
Coverage table
Below the summary cards, a sortable table lists every catalog table, ordered worst-first by default. Each row shows:
| Column | Description |
|---|
| Table | The catalog table name and its source key (if available). |
| Category | The data category the table belongs to. |
| Rows | Total row count in the table. |
| Populated | Number of rows with a populated record_id. |
| % | Percentage of rows that are envelope-compliant, color-coded by bucket. |
| Last backfill | When the most recent backfill job ran for this table, plus how many rows it processed. |
Color-coded status buckets
The percentage column is color-coded so you can scan for problem areas quickly:
| Color | Range | Meaning |
|---|
| Gray | 0% | No rows backfilled yet. |
| Red | < 25% | Early stages — backfill has started but has a long way to go. |
| Amber | 25%–74% | In progress — backfill is underway. |
| Blue | 75%–98% | Almost there — most rows are compliant. |
| Green | ≥ 99% | Fully compliant. |
How data refreshes
Coverage statistics are refreshed automatically by a nightly job that runs at 05:00 UTC, after the backfill window finishes. The dashboard header shows when the last refresh occurred.
You don’t need to trigger refreshes manually. Each nightly run scans every catalog table that has a record_id column, counts total rows versus populated rows, and updates the dashboard.
If a table was recently added to the catalog, it will appear on the dashboard after the next nightly refresh.
Compliance views
Two SQL views give you direct access to coverage and backfill data without leaving your database client. Use them for ad-hoc queries, alerting integrations, or any workflow where you need the raw numbers behind the dashboard.
codex_compliance_summary
Returns one row per tracked table with coverage percentages for every APRS envelope column, plus h3_index and mmsi. Coverage is derived from PostgreSQL catalog statistics (pg_stats.null_frac), so the query is fast and doesn’t scan table data.
select *
from codex_compliance_summary;
| Column | Description |
|---|
table_name | The catalog table (e.g. ais_positions, vessel_visits). |
table_size | Human-readable table size on disk. |
record_id_pct | Percentage of rows with a populated record_id. |
source_uri_pct | Percentage of rows with a populated source_uri. |
source_system_pct | Percentage of rows with a populated source_system. |
chunk_id_pct | Percentage of rows with a populated chunk_id. |
occurred_at_pct | Percentage of rows with a populated occurred_at. |
h3_index_pct | Percentage of rows with a populated h3_index. |
mmsi_pct | Percentage of rows with a populated mmsi. |
stats_refreshed_at | When PostgreSQL last ran ANALYZE on the table. |
Coverage percentages are based on the most recent ANALYZE run. If you need sub-percentage-point accuracy, run ANALYZE on the target table first:analyze ais_positions;
select * from codex_compliance_summary where table_name = 'ais_positions';
codex_backfill_progress
Shows the status of h3_index and mmsi backfill jobs over the last seven days. Each row represents one table with aggregated run counts, rows updated, and a stall indicator.
select *
from codex_backfill_progress;
| Column | Description |
|---|
table_name | The table being backfilled. |
run_count | Number of backfill runs in the last seven days. |
total_rows_updated | Total rows updated across all runs. |
last_run_at | Timestamp of the most recent completed run. |
since_last_run | Interval since the last run (useful for spotting gaps). |
last_cursor | The cursor position where the last run stopped. |
status | running if the last run was within five minutes, stalled otherwise. |
Use this view to detect stuck backfills before they fall behind:
select table_name, status, since_last_run
from codex_backfill_progress
where status = 'stalled';
Vessel-level coverage for port events
The port_events table now includes a dedicated mmsi column, completing the full APRS envelope for port-level activity data. This means you can:
- Join port events to vessel records — link arrivals, departures, and berth assignments directly to a vessel’s MMSI without going through a separate lookup table.
- Run spoofing-detection queries — the
mmsi column is indexed (partial index on non-null values), so queries that cross-reference vessel identity against port activity are fast even at scale.
- Track
mmsi coverage in the dashboard — the codex_compliance_summary view reports mmsi_pct for every tracked table, including port_events.
-- Find port events for a specific vessel
select *
from port_events
where mmsi = '211331640'
order by occurred_at desc
limit 20;
CLI compliance checker
The codex.cli check command validates APRS envelope compliance for any table — or all tables at once — from the command line. It samples recent rows and runs nine validation checks covering identity, provenance, temporal ordering, and access-tier fields.
Running a check
Check a single table:
python -m groundswell_collector.codex.cli check civic_records
Check all tables and write a JSON report:
python -m groundswell_collector.codex.cli check all --limit 500 --report /tmp/aprs.json
| Flag | Default | Description |
|---|
--limit | 1000 | Number of most-recent rows to sample per table. |
--report | — | Path to write a JSON report file. |
What gets checked
Each sampled row is validated against nine checks:
| Check | Severity | Rule |
|---|
record_id | critical | Must be a valid APRS URN (urn:aprs:record:...). |
source_uri | critical | Must be non-null. |
ingested_at | critical | Must be valid ISO 8601. |
schema_version | critical | Must match aprs.{domain}/{semver}. |
normalization_version | warning | Must be semver. |
occurred_at | warning | Should be set when semantically applicable. |
temporal_order | critical | occurred_at must not be after ingested_at. |
effective_order | critical | effective_from must not be after effective_to. |
acl_tier | warning | Must be research, commercial, or internal. |
Per-table column introspection
The compliance checker dynamically discovers which envelope columns each table actually has before running checks. This means:
- Tables that use a timestamp column other than
created_at for ordering (such as timestamp on AIS positions, started_at on events, loaded_at on building permits, or updated_at on vessels) are handled correctly.
- Optional envelope fields like
occurred_at, effective_from, and effective_to are only validated if the table has those columns. A missing optional column is not treated as non-compliant.
- The checker exits non-zero if any table has critical issues affecting more than 5% of sampled rows, making it suitable for CI pipelines.
The compliance checker runs automatically in CI on every pull request that touches data collectors or migrations. You can also run it locally to validate your changes before pushing.
Reading the output
The checker prints a formatted report with pass, warning, and fail indicators for each table:
- Pass — all critical checks above 95% compliance.
- Warning — non-critical checks below threshold, but no critical failures.
- Fail — one or more critical checks below 95% compliance.
The optional --report JSON file includes per-table compliance rates and individual issue details, suitable for dashboards or alerting integrations.
When to use it
- During APRS rollout — monitor adoption as backfill jobs work through historical data. Commodity tables (
conab_crop_reports, antt_road_freight, cepea_commodity_prices) and maritime tables (antaq_vessel_calls, usace_lock_passages) now populate envelope fields at write time, so new rows appear as compliant immediately.
- Before an audit — confirm that all evidence tables are fully envelope-compliant before generating an audit export.
- Troubleshooting stalled backfills — if a table’s percentage hasn’t changed in several days and the “Last backfill” timestamp is stale, the backfill job for that table may need attention. Query
codex_backfill_progress to check whether the job is running or stalled.
- Programmatic alerting — query
codex_compliance_summary or codex_backfill_progress from your monitoring tools to trigger alerts when coverage drops or backfills stall.