The LEHD Commuter Flows dataset contains ~454K origin-destination pairs derived from the Census Bureau's LODES v8 (LEHD Origin-Destination Employment Statistics), a product of the Longitudinal Employer-Household Dynamics program. Each pair represents a commute flow between two H3 resolution-8 cells, enriched with income breakdowns, NAICS sector counts, and Codex-computed accessibility indices.
Every record inherits the full APRS envelope (record_id, chunk_id, bitemporal fields, confidence_score, provenance) and carries the join keys documented below.
Dataset-specific fields
Flow identifiers
| Field | Type | Nullable | Description |
|---|---|---|---|
| origin_h3 | string | no | H3 resolution-8 cell of the workplace (derived from LODES w_geocode). |
| destination_h3 | string | no | H3 resolution-8 cell of the residence (derived from LODES h_geocode). |
| origin_block_fips | string | no | Census block FIPS code (15-character) for the workplace. Retained as a Census-native identifier. |
| destination_block_fips | string | no | Census block FIPS code (15-character) for the residence. |
| metro_slug | string | yes | Metro area identifier (derived from metro FIPS). |
Worker counts
| Field | Type | Nullable | Description |
|---|---|---|---|
| worker_count | integer | no | Total number of jobs (LODES S000). |
| worker_count_lt30k | integer | no | Jobs with earnings ≤ $1,250/month (LODES SE01). |
| worker_count_30to60k | integer | no | Jobs with earnings $1,251–$3,333/month (LODES SE02). |
| worker_count_gt60k | integer | no | Jobs with earnings > $3,333/month (LODES SE03). |
| income_band | enum | no | Codex-derived dominant income band: low, mid, or high. |
| job_sector_naics | string | yes | NAICS sector code with aggregated counts (LODES SI01–SI03). |
Codex enrichments
| Field | Type | Nullable | Description |
|---|---|---|---|
| accessibility_index | float [0,1] | no | Codex-computed accessibility score per origin-destination H3 pair. Higher values indicate better transit and commute options. |
| distance_km | float | no | Centroid-to-centroid distance in kilometers between origin and destination H3 cells. |
| h3_neighbor_rank | integer | no | k-ring distance from origin to destination (0 = same cell, 1 = immediate neighbor, etc.). |
| lehd_year | integer | no | Reference year for the LODES data (e.g. 2023). |
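To make the distance_km definition concrete, here is a minimal sketch of a centroid-to-centroid distance. It assumes a great-circle (haversine) calculation on a spherical Earth; the source does not state which distance formula Codex uses, and the centroid coordinates and the `haversine_km` helper below are hypothetical. In practice the centroids would come from an H3 library lookup on origin_h3 and destination_h3.

```python
import math

# Mean Earth radius in km. This constant is an assumption for illustration,
# not a documented Codex parameter.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical cell centroids as (lat, lng); real values would be looked up
# from the origin_h3 and destination_h3 cell IDs.
origin_centroid = (37.7749, -122.4194)
destination_centroid = (37.8044, -122.2712)

distance_km = haversine_km(*origin_centroid, *destination_centroid)
```

A flow with h3_neighbor_rank of 0 has both endpoints in the same cell, so a calculation like this one would return 0 for it.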
Income bands
The income_band field is derived from the LODES earnings breakdown:
| Band | Criterion |
|---|---|
| low | Plurality of workers earn ≤ $1,250/month |
| mid | Plurality of workers earn $1,251–$3,333/month |
| high | Plurality of workers earn > $3,333/month |
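The plurality rule in the table can be sketched as a small function over the three earnings counts. This is an illustration only: the function name is hypothetical, and the tie-breaking order (low, then mid, then high) is an assumption, since the source does not say how Codex resolves ties.

```python
def dominant_income_band(worker_count_lt30k, worker_count_30to60k, worker_count_gt60k):
    """Return the band ('low', 'mid', 'high') with the largest worker count."""
    counts = {
        "low": worker_count_lt30k,
        "mid": worker_count_30to60k,
        "high": worker_count_gt60k,
    }
    # max() returns the first maximal key in insertion order, so ties resolve
    # as low > mid > high. That ordering is an assumption, not documented behavior.
    return max(counts, key=counts.get)


band = dominant_income_band(12, 30, 8)
```

Because the band reflects only a plurality, a flow labeled mid can still carry substantial low- and high-earning counts; use the three count fields directly when the split matters.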
Accessibility index
The accessibility_index is a Codex-computed score that factors in transit coverage, commute distance, and commute volume between the origin and destination cells. It is useful for identifying well-connected corridors versus underserved commute routes.
Census LODES data includes noise infusion to protect respondent privacy. Codex preserves this noise as-is, so small counts (under ~10 workers) may not reflect exact flows.
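One way to act on the small-count caveat is to separate low-volume flows before analysis. The sketch below assumes a hard cutoff of 10 workers, taken from the approximate guidance above; the threshold constant and the `split_by_reliability` helper are hypothetical, and the right cutoff depends on the analysis.

```python
# Cutoff taken from the "~10 workers" guidance; treating it as a hard
# threshold is an assumption made for this sketch.
SMALL_FLOW_THRESHOLD = 10


def split_by_reliability(flows, threshold=SMALL_FLOW_THRESHOLD):
    """Split flow records into (reliable, noisy) lists by worker_count."""
    reliable = [f for f in flows if f["worker_count"] >= threshold]
    noisy = [f for f in flows if f["worker_count"] < threshold]
    return reliable, noisy


# Illustrative records carrying only the field this sketch needs.
sample_flows = [
    {"worker_count": 3},
    {"worker_count": 10},
    {"worker_count": 250},
]

reliable, noisy = split_by_reliability(sample_flows)
```

Keeping the noisy flows in a separate list, rather than dropping them, lets them still contribute to aggregate totals where individual-pair precision matters less.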
Join keys
| Key | Presence | Notes |
|---|---|---|
| record_id | always | APRS URN |
| chunk_id | always | Deterministic from record_id |
| origin_h3 | always | Join with Urban Signal Grid, POI Intelligence, or any H3-indexed dataset |
| destination_h3 | always | Same as above |
| origin_block_fips | always | Census-native identifier for block-level joins |
| destination_block_fips | always | Same as above |
| metro_slug | often | Metro area identifier |
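As an illustration of an H3 join, the sketch below attaches per-cell attributes from another H3-indexed dataset to each flow. The `left_join_on_h3` helper, the `poi_count` attribute, and all cell IDs other than the one used in the example query are placeholders, not documented fields or verified cells.

```python
def left_join_on_h3(flows, cell_attributes, key="origin_h3"):
    """Left-join flows to a dict of per-cell attributes keyed by H3 cell ID.

    Flows whose cell has no match are kept unchanged.
    """
    joined = []
    for flow in flows:
        attrs = cell_attributes.get(flow[key], {})
        joined.append({**flow, **attrs})
    return joined


# Placeholder flow records and a hypothetical per-cell attribute table.
flows = [
    {"origin_h3": "88283082b9fffff", "destination_h3": "88283082b1fffff", "worker_count": 120},
    {"origin_h3": "88283082b3fffff", "destination_h3": "88283082b1fffff", "worker_count": 45},
]
poi_by_cell = {"88283082b9fffff": {"poi_count": 37}}

enriched = left_join_on_h3(flows, poi_by_cell)
```

Passing `key="destination_h3"` joins on the other end of the flow. Both H3 keys are always present, so neither join needs null handling on the flow side.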
Example query
Find the highest-volume commute corridors into a downtown cell:
```sql
SELECT
    origin_h3,
    destination_h3,
    worker_count,
    income_band,
    accessibility_index,
    distance_km
FROM read_parquet('lehd-commuter-flows-2026-04.parquet')
WHERE destination_h3 = '88283082b9fffff'
  AND worker_count >= 50
ORDER BY worker_count DESC
LIMIT 20;
```
Known limitations
- LODES data lags by 2–3 years. The lehd_year field indicates the reference year; the counts do not reflect current conditions.
- Census noise infusion means small cell-pair flows (under ~10 workers) carry significant uncertainty.
- Block-level granularity is collapsed to H3 resolution 8. Multiple Census blocks may map to the same H3 cell.
- metro_slug is null for flows in rural areas outside defined metro boundaries.
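The collapse from Census blocks to H3 cells can be pictured as a group-and-sum over block pairs. The sketch below is an assumption about the general shape of that step, not the Codex pipeline itself: the `aggregate_to_h3` helper, the block-to-cell lookup, and every identifier in the sample data are hypothetical.

```python
from collections import Counter


def aggregate_to_h3(block_flows, block_to_h3):
    """Sum worker counts over block pairs that share an (origin, destination) H3 pair.

    block_flows: iterable of (origin_block_fips, destination_block_fips, worker_count)
    block_to_h3: dict mapping a block FIPS code to its H3 resolution-8 cell ID
    """
    totals = Counter()
    for origin_block, destination_block, worker_count in block_flows:
        pair = (block_to_h3[origin_block], block_to_h3[destination_block])
        totals[pair] += worker_count
    return dict(totals)


# Placeholder 15-character block codes; two origin blocks fall in the same cell.
block_to_h3 = {
    "060750101001000": "88283082b9fffff",
    "060750101001001": "88283082b9fffff",
    "060010201002000": "88283082b1fffff",
}
block_flows = [
    ("060750101001000", "060010201002000", 4),
    ("060750101001001", "060010201002000", 7),
]

h3_flows = aggregate_to_h3(block_flows, block_to_h3)
```

In this sample the two block-level flows merge into a single cell pair with 11 workers, which is why a block-level breakdown cannot be recovered from the H3 keys alone.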