Skip to main content
Codex datasets are designed so any two can join without custom wrangling. This page documents the keys that make that possible, which datasets carry which keys, and how to use them.

The keys

KeyTypeFormatUsed for
record_idURNurn:aprs:record:{namespace}:{source_system}:{local_id}Stable identity across re-ingests. Primary key within a dataset.
chunk_idURNurn:aprs:chunk:{hex16}Vector-index key. Deterministic SHA-256 of record_id + section.
h3_indexstringH3 resolution-8 cell (e.g. 88283082b9fffff)Spatial join across any geospatial datasets.
event_idUUIDStandard UUIDv4Joins Events Timeline to any dataset that emits events.
jurisdiction_slugstring{city-slug}-{state-abbr} (e.g. philadelphia-pa)Civic-layer joins across city, county, and state boundaries.
mmsistring9-digit Maritime Mobile Service IdentityAIS position to vessel joins. Not stable across re-registrations — use imo when you need permanence.
imointeger7-digit IMO numberVessel identity across ownership, flag, and name changes. Primary vessel key.
poi_idURNurn:aprs:record:poi:{source_system}:{local_id}POI joins across datasets.
parcel_idstring{jurisdiction_slug}:{parcel-number}Parcel-level real-estate joins.
entity_urnURNurn:aprs:entity:{type}:{canonical-id}Entity resolution across sources (person, company, agency, vessel, etc.).

Key availability by dataset

Dataseth3_indexevent_idjurisdiction_slugmmsiimopoi_idparcel_identity_urn
Civic Intelligenceyesyesyesopt.yes
Events TimelineyesPKyesopt.opt.opt.opt.yes
Urban Signal GridPKyes
POI IntelligenceyesyesPK
AIS Maritimeyesyesyesyes
LEHD Commuter Flowsyesyes
Permit Signalsyesyesyesyesyes
PK = primary key. opt. = emitted when the relevant entity is known. = not applicable.

Join examples

Civic decisions in high-signal cells

Join Civic Intelligence to Urban Signal Grid via h3_index to find recent zoning decisions in high-scoring cells:
SELECT
  cells.h3_index,
  cells.composite_score,
  civic.event_type,
  civic.occurred_at,
  civic.litigation_risk_score
FROM codex.urban_signal_grid cells
JOIN codex.civic_intelligence civic USING (h3_index)
WHERE civic.event_type IN ('zoning_vote', 'rezoning_hearing')
  AND civic.occurred_at >= '2026-01-01'
  AND cells.composite_score > 0.7;

Full event detail for a jurisdiction

Join Events Timeline to Civic Intelligence via event_id to get entities, blockers, and hostility scores for civic events:
SELECT
  e.event_id,
  e.occurred_at,
  e.category,
  c.entities_extracted,
  c.blockers,
  c.hostility_index
FROM codex.events_timeline e
JOIN codex.civic_intelligence c USING (event_id)
WHERE e.jurisdiction_slug = 'philadelphia-pa'
  AND e.occurred_at >= '2026-03-01';

Vessel positions around dark events

Join AIS Maritime to Events Timeline via event_id to find vessels and their positions around Overwatch dark events:
SELECT
  e.event_id,
  e.occurred_at,
  ais.mmsi,
  ais.imo,
  ais.lat,
  ais.lng,
  ais.speed
FROM codex.events_timeline e
JOIN codex.ais_maritime ais USING (event_id)
WHERE e.category = 'dark_event'
ORDER BY e.occurred_at DESC;

Commuter inflow to high-signal cells

Join LEHD Commuter Flows to Urban Signal Grid via h3_index:
SELECT
  g.h3_index AS destination_h3,
  g.composite_score,
  SUM(l.worker_count) AS inbound_workers
FROM codex.urban_signal_grid g
JOIN codex.lehd_commuter_flows l
  ON g.h3_index = l.destination_h3
GROUP BY g.h3_index, g.composite_score;

Joins that will not work

Attempted joinWhy it failsUse instead
lat, lng equalityFloating-point equality is unsafe; sources round differentlyh3_index
vessel_nameNames change with ownership and flag. Not unique.imo
Source-native IDs across datasetsDifferent sources use different IDs for the same entityentity_urn
Date-only joinsTime zones and publication lag make cross-dataset alignment unreliableJoin on key + temporal filter (e.g. ABS(a.occurred_at - b.occurred_at) < interval '7 days')

Key construction rules

h3_index

Always resolution 8. Stored as the canonical H3 hexadecimal string (e.g. 88283082b9fffff). When a record covers an area, the dataset either emits a separate row per H3 cell or an array column h3_indexes — the choice is documented in each dataset’s schema.

jurisdiction_slug

Lowercase, hyphen-separated:
  • Cities: {city}-{state-abbr} (e.g. philadelphia-pa, san-francisco-ca)
  • Counties: {county}-county-{state-abbr} (e.g. harris-county-tx)
  • States: {state-full-name} (e.g. texas)
  • Special districts: {name}-{state-abbr} (e.g. port-authority-ny)

mmsi vs. imo

mmsi is the runtime identifier on every AIS position — use it for linking positions to vessel records during ingest. imo is the permanent vessel identifier (IMO A.600) that survives ownership, flag, and name changes. Use imo as the primary vessel key when available. Vessels without an IMO number (some fishing vessels and small craft) carry has_imo=false.

entity_urn

Format: urn:aprs:entity:{type}:{canonical-id} where type is one of person, company, agency, trust, vessel, or nonprofit. Two source records share the same entity_urn when Codex entity resolution determines they refer to the same real-world entity.