H3 cell index
Locus uses Uber’s H3 hierarchical hexagonal grid for spatial indexing. Every score, metric, and POI lookup resolves to an H3 cell.
Resolution
| Resolution | Edge length (km) | Use case |
|---|---|---|
| 6 | 3.7 | Metro-level summaries (San Francisco, Chicago) |
| 8 | 0.46 | Neighborhood-level scoring (default for /api/score) |
| 9 | 0.17 | Block-level analysis (high-res scoring profiles) |
radius parameter — Locus aggregates child cells appropriately.
Per-metro resolution overrides
Some low-density, large-parcel metros are scored at H3 r7 (~5.16 km², ~1280 m radius) instead of the default r8 (~0.74 km², ~490 m radius). At r8 a single cell in these markets often covers only one CRE asset, which leaves signal aggregation dominated by small-n noise. Coarsening to r7 produces enough samples per cell for the score to be statistically meaningful. Metros currently overridden to r7:
- Phoenix
- Houston
- Las Vegas
- Dallas
- San Antonio
- Nashville
- Jacksonville
- Oklahoma City
- El Paso
- Fort Worth
cell_scores is tagged with the resolution it was computed at via the resolution_variant column, so downstream consumers (rankings, MAUP ensembles, the explorer) can distinguish r7 from r8 cells. If you join scores across metros, filter or group on resolution_variant rather than assuming a single resolution.
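As a minimal sketch of that advice, the snippet below groups score rows by resolution_variant before averaging them. Only the resolution_variant column comes from the text above; the row shape, the composite_score field name, and the "r7" and "r8" labels are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_variant(rows):
    """Average scores separately for each resolution_variant value."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["resolution_variant"]].append(row["composite_score"])
    return {variant: mean(scores) for variant, scores in buckets.items()}


# Hypothetical rows; the scores are made up for the example.
rows = [
    {"metro": "Phoenix", "resolution_variant": "r7", "composite_score": 60.0},
    {"metro": "Phoenix", "resolution_variant": "r7", "composite_score": 70.0},
    {"metro": "Chicago", "resolution_variant": "r8", "composite_score": 80.0},
]
by_variant = mean_score_by_variant(rows)
```

Keeping the two variants in separate buckets avoids comparing an r7 cell, which covers roughly seven times the area, directly against an r8 cell.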
Cell IDs in API responses
Cell IDs are the standard 15-character hex strings produced by h3-js (e.g. 8828308281fffff). Use these as opaque identifiers when caching scores client-side; Locus pins the H3 algorithm version and won’t change cell-ID semantics without a major API version bump.
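A client-side cache might check the shape of a cell ID before using it as a key. The sketch below is a shape check only (15 lowercase hex characters, as described above); it does not verify that the string is a valid H3 index, and the function names and cache-key layout are hypothetical.

```python
import re

# Shape check only: 15 lowercase hex characters, per the description above.
CELL_ID_PATTERN = re.compile(r"[0-9a-f]{15}")


def looks_like_cell_id(value):
    """Return True if value has the shape of a Locus cell ID string."""
    return isinstance(value, str) and CELL_ID_PATTERN.fullmatch(value) is not None


def cache_key(cell_id, profile):
    """Build a hypothetical cache key from an opaque cell ID and a profile name."""
    if not looks_like_cell_id(cell_id):
        raise ValueError(f"unexpected cell ID shape: {cell_id!r}")
    return f"{profile}:{cell_id}"
```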
Scoring profiles
The profile parameter on /api/score adjusts signal weights for a use case. Each profile is a fixed weighting — Locus does not auto-tune weights from your historical data (custom-tuned profiles are on the roadmap).
| Profile | Top-weighted signals | Underweighted signals |
|---|---|---|
| general | Development pipeline (permits), economic strength (employment, GDP), accessibility (transit) | Amenity demand, demographics, population momentum |
| qsr | Foot traffic, demographics (income), competitor density | Permits, real estate |
| self_storage | Population density, housing turnover, road accessibility | Foot traffic, business vitality |
| retail | Foot traffic, parking availability, complementary tenants | Jobs, permits |
| office | Transit access, daytime population, business vitality | Demographics (residential), traffic |
| data_center | Broadband, power infrastructure, environmental risk, accessibility | Amenity demand, business vitality |
| industrial | Road accessibility, workforce availability, zoning, economic strength | Amenity demand, demographics (residential) |
general profile weights
The general profile is tuned for generic commercial real estate (CRE) price prediction rather than any specific use case. Weights are calibrated against the empirical CRE price-correlation literature — building permit issuance, employment density change, and transit ridership are the three signals with the strongest published correlations and clearest lead times against CRE prices.
| Signal group | Weight | Rationale |
|---|---|---|
| developmentPipeline | 0.20 | Building permit issuance has R² ≈ 0.55 against CRE prices on a 6–12 month lead (Conference Board LEI). |
| economicStrength | 0.20 | Employment density change (LEHD) shows strong correlation with retail and office CRE on a 3–9 month lead. |
| businessVitality | 0.15 | Establishment counts and payroll trends — directional but less validated than the top two. |
| accessibility | 0.12 | Transit ridership shows positive correlation with walkable commercial CRE on a 12–24 month lead. |
| populationMomentum | 0.10 | Migration and population change — long-lead but noisier in short windows. |
| demographics | 0.10 | Income, age mix, and education — slow-moving baseline, not a short-term price driver. |
| safetyEnvironment | 0.08 | Crime is the peer-reviewed safety predictor; flood and environmental risk feed in as constraints. |
| amenityDemand | 0.05 | Schools, food access, and retail amenities — important for residential but a weak signal for generic CRE. |
Use a use-case profile (qsr, office, industrial, retail, data_center, self_storage) when you have a specific tenant or asset class in mind — those weights reflect industry-specific priorities rather than generic price prediction. The use-case profile weights are unchanged from prior general rebalances.
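To make the general weights concrete, here is a minimal sketch that combines per-group scores into a composite. The weights are copied from the table above; treating the composite as a plain weighted sum of 0–100 group scores is an assumption, and composite_score is a hypothetical helper rather than a Locus function.

```python
# Weights copied from the general profile table; they sum to 1.0.
GENERAL_WEIGHTS = {
    "developmentPipeline": 0.20,
    "economicStrength": 0.20,
    "businessVitality": 0.15,
    "accessibility": 0.12,
    "populationMomentum": 0.10,
    "demographics": 0.10,
    "safetyEnvironment": 0.08,
    "amenityDemand": 0.05,
}


def composite_score(group_scores, weights=GENERAL_WEIGHTS):
    """Weighted sum of 0-100 group scores (assumed aggregation rule)."""
    return sum(weight * group_scores[group] for group, weight in weights.items())
```

Under this assumption, raising developmentPipeline by 10 points moves the composite by 2 points, while the same change in amenityDemand moves it by 0.5.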
Development pipeline sub-scores
The developmentPipeline group rolls up several permit- and construction-derived sub-signals. Each subScores entry returned in the API response carries a name, a 0–100 score, a weight, and a source label.
| Sub-score | Weight | Source | What it measures |
|---|---|---|---|
| Permit Activity | 0.40 | Building Permits | Cell-level permit count and total declared valuation over the trailing 6 months. Falls back to metro-wide averages when the cell itself has zero permits. |
| Construction Detection | 0.20 | Sentinel-2 | Satellite-derived change-detection score for active construction footprints. |
| Land Cover Change | 0.20 | NLCD | Year-over-year change in the share of developed land cover. |
| Opportunity Zone | 0.10 | Federal OZ | Binary flag for federal Qualified Opportunity Zone designation. |
| Permit Velocity | 0.05 | Building Permits Trend | Acceleration or deceleration in permit issuance vs. the prior 6-month window. |
| Permit Volatility | 0.05 | Building Permits Volatility | Run-to-run variance in permit issuance; high volatility lowers the sub-score. |
| Permit Valuation Tier | 0.10 | AXL-5 | LLM-classified valuation tier (transformative, structural, stabilization) for the dominant permit mix in the cell. |
| Permit Scope Quality | 0.10 | AXL-108 | Per-permit scope_type × cost_tier quality signal — see below. |
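One way to read a subScores array is as a weighted average over the entries that are present. The weights listed above sum to more than 1, and a sub-score can be omitted when its source is missing, so the sketch below divides by the total weight of the entries it receives. That normalization is an assumption, not a documented rule, and group_score is a hypothetical helper.

```python
def group_score(sub_scores):
    """Weighted average of subScores entries, normalized by the weights present.

    Each entry is a dict with name, score (0-100), weight, and source keys.
    Returns None when there is nothing to average.
    """
    total_weight = sum(entry["weight"] for entry in sub_scores)
    if total_weight == 0:
        return None
    weighted = sum(entry["score"] * entry["weight"] for entry in sub_scores)
    return weighted / total_weight


# Hypothetical entries; the scores are made up for the example.
example = [
    {"name": "Permit Activity", "score": 80, "weight": 0.40, "source": "Building Permits"},
    {"name": "Opportunity Zone", "score": 100, "weight": 0.10, "source": "Federal OZ"},
]
example_score = group_score(example)
```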
Permit Scope Quality (AXL-108)
Permit Scope Quality weights every permit in the trailing 6-month window by what the permit is for, not just how many were issued. Each permit’s scope_type and estimated_cost_tier — both extracted by an LLM-based permit classifier from the free-text description — are multiplied to yield a 0–100 contribution, and the cell’s sub-score is the average of those contributions across the window.
Scope-type weights:
| scope_type | Weight |
|---|---|
| new_construction | 1.0 |
| addition | 0.7 |
| demolition | 0.5 |
| renovation | 0.3 |
| repair | 0.1 |
Cost-tier multipliers:
| estimated_cost_tier | Multiplier |
|---|---|
| Tier 1 (highest) | 1.5 |
| Tier 2 | 1.0 |
| Tier 3 | 0.5 |
| Tier 4 (lowest) | 0.25 |
A cell dominated by new_construction permits in the highest cost tier will trend toward 100; a cell dominated by low-cost repair permits will trend toward 0. The signal is designed to separate cells where permits indicate genuine new development from cells where permits mostly reflect maintenance churn.
When the LLM extractor has not yet annotated any permit in a cell, the sub-score is omitted from subScores and AXL-108 appears in sourcesMissing for the group.
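The arithmetic described in this section can be sketched as follows. The scope weights and tier multipliers come from the tables above. Scaling the product by 100, capping each contribution at 100 (new_construction in Tier 1 would otherwise reach 150), and encoding tiers as the integers 1 to 4 are all assumptions.

```python
SCOPE_WEIGHTS = {
    "new_construction": 1.0,
    "addition": 0.7,
    "demolition": 0.5,
    "renovation": 0.3,
    "repair": 0.1,
}
# Tier 1 is the highest cost tier, Tier 4 the lowest.
TIER_MULTIPLIERS = {1: 1.5, 2: 1.0, 3: 0.5, 4: 0.25}


def permit_contribution(scope_type, cost_tier):
    """Contribution of one permit on a 0-100 scale (cap at 100 is assumed)."""
    raw = 100.0 * SCOPE_WEIGHTS[scope_type] * TIER_MULTIPLIERS[cost_tier]
    return min(raw, 100.0)


def scope_quality(permits):
    """Average contribution over (scope_type, cost_tier) pairs in the window.

    Returns None for an empty window, mirroring the omitted sub-score.
    """
    if not permits:
        return None
    total = sum(permit_contribution(scope, tier) for scope, tier in permits)
    return total / len(permits)
```

With these numbers, a Tier 4 repair permit contributes 2.5 points and a Tier 1 new_construction permit contributes the capped 100.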
Safety & environment sub-scores
The safetyEnvironment group rolls up crime, regulatory hazard maps, realized loss history, environmental burden, air quality, and 311 service-request signals.
| Sub-score | Weight | Source | What it measures |
|---|---|---|---|
| Crime Rate | 0.30 | Local Police | Reported incidents per 1k residents in the trailing window. |
| Flood Risk | 0.20 | FEMA | Forward-looking FEMA FIRM flood-zone designation (X, A, AE, V, etc.). |
| Flood Loss History | 0.15 | AXL-109 (FEMA NFIP) | Realized NFIP claim history for the cell’s nearest zip — see below. |
| Environmental Risk | 0.15 | EPA EJScreen | Composite EJ index (pollution + demographic burden). |
| Natural Hazard Risk | 0.15 | FEMA NRI | National Risk Index composite for 18 natural hazard types. |
| Air Quality | 0.10 | EPA AirNow | Trailing AQI exposure. |
| Complaint Density | 0.05 | 311 Service Requests | Open-complaint volume per cell. |
| Complaint Velocity | 0.05 | 311 Service Requests Velocity | 30-day change in complaint volume. |
| Resolution Time | 0.05 | AXL-6 | Average days-to-close for 311 complaints in the cell. |
| Complaint Trend YoY | 0.05 | AXL-107 (311 YoY) | Year-over-year delta on the cell’s trailing 30-day 311 complaint density — see below. |
Flood Loss History (AXL-109)
Flood Loss History is a realized-loss companion to the regulatory FEMA FIRM Flood Risk sub-score. The two signals answer different questions:
- Flood Risk — What does the regulatory map say this cell is? Forward-looking, redrawn on a multi-year cycle.
- Flood Loss History — What has actually happened here? Backward-looking, derived from FEMA National Flood Insurance Program (NFIP) claims.
A/V FIRM zones — the FIRM map is a snapshot of modeled hazard, while NFIP claims capture realized loss patterns including pluvial flooding, drainage failures, and stormwater backups that the FIRM zone often misses.
The sub-score is computed by:
- Finding the nearest US zip centroid to the cell (single PostGIS nearest-neighbor lookup).
- Joining that zip to the aggregated NFIP claims table (677k+ raw claims rolled up to ~17.8k zip-level records).
- Returning a normalized exposure score in [0, 1] derived from claim count, repeat-loss policy share, and the most recent claim year.
- Inverting to a 0–100 safety contribution: a zip with no historic claims scores 100, a maximum-exposure zip scores 0.
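The final inversion step can be sketched as below. The two endpoints (exposure 0 scores 100, exposure 1 scores 0) come from the text; the straight-line mapping between them and the clamping of out-of-range inputs are assumptions, and flood_loss_safety is a hypothetical name.

```python
def flood_loss_safety(exposure):
    """Invert a normalized NFIP exposure in [0, 1] to a 0-100 safety contribution.

    Assumes a linear inversion; inputs outside [0, 1] are clamped.
    """
    clamped = min(max(exposure, 0.0), 1.0)
    return 100.0 * (1.0 - clamped)
```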
subScores and AXL-109 (FEMA NFIP) appears in sourcesMissing for the group. NFIP coverage is US-only — international cells will always show this source as missing.
Flood Risk (because it sits in a regulatory AE or V zone) and low on Flood Loss History (because NFIP has paid out repeated claims there). Use both sub-scores together when ranking cells for hazard-sensitive use cases — relying on the FIRM zone alone will under-flag the worst pluvial-flood corridors.
Complaint Trend YoY (AXL-107)
Complaint Trend YoY is a directional companion to the volume-based Complaint Density and Complaint Velocity sub-scores. Density measures how loud a cell is right now; velocity measures the 30-day change; the YoY delta measures whether that loudness is rising or falling against the same window one year ago — the slow-moving signal that volume and short-window velocity both miss.
The sub-score is computed in-app from the same service_requests_311 rows the scorer already pulls for density and resolution time, so no additional data source is required:
- Count 311 complaints in the trailing 30 days (density_30d).
- Count 311 complaints in the matching 30-day window one year prior (the 335-to-395-day window).
- Compute the YoY delta as (current − prior) / max(prior, 1), capped at +5× to bound runaway ratios in cells with a near-zero prior baseline.
- Map the delta onto a 0–100 safety contribution: −1 (complaints down 100%) → 100, +3 (complaints up 3×) → 0, with values above +3× saturating at 0.
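The delta and mapping steps above can be sketched in a few lines. The delta formula, the +5 cap, and the two anchor points (−1 maps to 100, +3 maps to 0) come from the text; linear interpolation between the anchors is an assumption, and both function names are hypothetical.

```python
def yoy_delta(current, prior):
    """Year-over-year change in complaint counts, capped at +5."""
    return min((current - prior) / max(prior, 1), 5.0)


def yoy_safety_score(delta):
    """Map a delta onto 0-100: -1 -> 100, +3 -> 0, linear in between (assumed).

    Deltas above +3 saturate at 0.
    """
    score = 100.0 * (3.0 - delta) / 4.0
    return min(max(score, 0.0), 100.0)
```

Under the linear assumption, a cell with unchanged complaint volume (delta 0) lands at 75 rather than 50, because the scale is anchored asymmetrically at −1 and +3.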
subScores and AXL-107 (311 YoY) appears in sourcesMissing for the group rather than emitting a misleading “improving” signal from a quiet cell with no complaint history. Cells in metros without 311 ingestion will always show this source as missing.
A cell with high Complaint Density and a high (improving) Complaint Trend YoY is loud-but-getting-quieter, while a cell with low density and a low (deteriorating) trend score is quiet-but-getting-louder — the kind of leading indicator that volume alone obscures.
Confidence semantics
Every Locus response includes a confidence field (0.0–1.0) representing the ratio of expected data sources that returned data for the location.
- 0.9+ (high) — almost all data sources contributed; treat as authoritative.
- 0.7–0.9 (medium) — significant signals present; treat as directional, not exact.
- 0.5–0.7 (low) — sparse coverage; useful for filtering but not for ranking.
- <0.5 (very low) — Locus returns the score for transparency but it should not be used for production decisions.
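A client might translate the confidence value into these bands before deciding how to use a score. In the sketch below the thresholds follow the list above; assigning the shared boundary values (0.9, 0.7, 0.5) to the higher band and the returned label strings are assumptions.

```python
def confidence_tier(confidence):
    """Bucket a 0.0-1.0 confidence value into the bands listed above."""
    if confidence >= 0.9:
        return "high"
    if confidence >= 0.7:
        return "medium"
    if confidence >= 0.5:
        return "low"
    return "very_low"


def usable_for_ranking(confidence):
    """Low and very-low confidence scores are not suitable for ranking."""
    return confidence_tier(confidence) in ("high", "medium")
```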
Coverage guards on signal groups
Some signal groups include a minimum-source guard to prevent a single universally-available data source from dominating the group score for cells where every other source is missing. When a group’s coverage guard isn’t met, the group emits a neutral no-signal score of 50 and a confidence of 0, with an empty subScores array — instead of returning a misleadingly high or low score derived from one input.
| Signal group | Minimum sub-signals | Reason |
|---|---|---|
| safetyEnvironment | 2 | FEMA flood zone is available for almost every US cell (typically X = 100). Without a guard, cells with no crime, EJScreen, NRI, AQI, NFIP loss history, or 311 data would score 100/100 on safety purely because of flood-zone coverage. |
| All other groups | 1 (no guard) | These groups don’t currently have a high-scoring, universally-available source that would dominate the average. |
Treat a safetyEnvironment.score of 50 paired with confidence: 0 as “no safety signal available” — not as “neutral safety.” The 50 is a placeholder so the composite score can still be computed; it is not a measurement. Use the confidence field to filter these cells out of safety-sensitive ranking lists.
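The guard behavior and the client-side filter can be sketched together. The minimum of 2 for safetyEnvironment and the placeholder values (score 50, confidence 0, empty subScores) come from this section; the function names and the dict shape of a group result are assumptions.

```python
# Only safetyEnvironment has a guard today, per the table above.
MIN_SUB_SIGNALS = {"safetyEnvironment": 2}


def apply_coverage_guard(group, sub_scores, score, confidence):
    """Return the group result, or the neutral placeholder if the guard fails."""
    minimum = MIN_SUB_SIGNALS.get(group)
    if minimum is not None and len(sub_scores) < minimum:
        return {"score": 50, "confidence": 0, "subScores": []}
    return {"score": score, "confidence": confidence, "subScores": sub_scores}


def has_signal(group_result):
    """False for the placeholder: a score of 50 with confidence 0."""
    return not (group_result["score"] == 50 and group_result["confidence"] == 0)
```

Filtering on has_signal before building a safety-sensitive ranking keeps placeholder cells from being read as mid-range safety.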
Pioneer Signal and early-stage gentrification indicator
Locus includes a Pioneer Signal detection system that identifies early signs of neighborhood transformation. The system tracks a multi-stage cascade that typically precedes gentrification by 12–24 months:
- Pioneer businesses — new specialty coffee shops, art galleries, or co-working spaces appear in a previously underserved area.
- Council language shift — city council meeting minutes begin referencing “revitalization,” “mixed-use,” or “transit-oriented development” for the area.
- Permit acceleration — building permit velocity increases, particularly renovation and change-of-use permits.
- Rezoning activity — formal rezoning applications or variance requests are filed.
| Field | Type | Description |
|---|---|---|
| esgi.score | float | ESGI score from 0.0 to 1.0 indicating transformation likelihood. |
| esgi.stage | string | Current cascade stage: pioneer_businesses, council_language_shift, permit_acceleration, or rezoning_activity. |
| esgi.upzoning_probability | float | Probability that the cell will be upzoned within 12 months. |
| esgi.litigation_risk | float | Risk of development delays from litigation or community opposition. |
| esgi.spatial_spillover | float | Influence from transformation activity in neighboring cells. |
Uncertainty quantification
Scoring responses include uncertainty metadata that helps you assess how much to trust a given score. Locus runs a Monte Carlo simulation (96 samples) with confidence-aware noise injection to produce:
- Confidence intervals — geo-conformal prediction intervals at 80% and 90% coverage.
- Sobol sensitivity indices — first-order and total-order indices showing which signal groups contribute most to score variance.
- Epistemic flags — signals where data is sparse or conflicting, flagged for transparency.
- Spatial Sharpe ratio — a risk-adjusted score metric analogous to a financial Sharpe ratio, indicating score stability relative to spatial neighbors.
Add include=uncertainty to any scoring endpoint:
| Field | Type | Description |
|---|---|---|
| uncertainty.ci_80 | array | 80% confidence interval [lower, upper] for the composite score. |
| uncertainty.ci_90 | array | 90% confidence interval [lower, upper]. |
| uncertainty.spatial_sharpe | float | Risk-adjusted score relative to neighboring cells. Values above 1.0 indicate the score is stable compared to peers. |
| uncertainty.sobol_first_order | object | First-order Sobol indices per signal group. Higher values mean that group drives more score variance. |
| uncertainty.epistemic_flags | array | Signal groups with sparse or conflicting data. |
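A consumer might reduce these fields to a few headline numbers. The sketch below assumes the fields arrive as a nested uncertainty object keyed by the names in the table (without the uncertainty. prefix); summarize_uncertainty is a hypothetical helper and the sample values are made up.

```python
def summarize_uncertainty(uncertainty):
    """Pull the interval width, top variance driver, and flags from the payload."""
    lower, upper = uncertainty["ci_90"]
    sobol = uncertainty.get("sobol_first_order", {})
    # The group with the largest first-order index drives the most variance.
    top_group = max(sobol, key=sobol.get) if sobol else None
    return {
        "ci_90_width": upper - lower,
        "top_variance_driver": top_group,
        "flagged_groups": list(uncertainty.get("epistemic_flags", [])),
    }


# Hypothetical payload for illustration only.
sample = {
    "ci_80": [62.0, 73.0],
    "ci_90": [60.0, 75.0],
    "spatial_sharpe": 1.2,
    "sobol_first_order": {"developmentPipeline": 0.4, "economicStrength": 0.3},
    "epistemic_flags": ["amenityDemand"],
}
summary = summarize_uncertainty(sample)
```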
Per-cell consensus class
The uncertainty payload tells you how noisy a score is given the data feeding it. The consensus class answers a different question: would this score still hold if Locus had aggregated to a different grid? It is a per-cell sensitivity check against the Modifiable Areal Unit Problem (MAUP) — the well-known finding that spatial statistics can move materially when the underlying grid changes.
Locus computes a consensus_class for every cell by comparing the cell’s percentile rank within its metro at the default resolution against the percentile rank of the parent (coarser) hex it sits inside. When the two ranks agree the score is robust to grid choice; when they diverge the score is at least partly a grid artifact.
The classification is surfaced on /api/cells/detail (the endpoint that powers the Explorer’s cell-detail panel) as three fields:
| Field | Type | Description |
|---|---|---|
| consensus_class | string \| null | One of stable_core, ambiguity_shell, stable_non_signal, or null if the cell has not yet been classified. |
| support_stability | float \| null | Agreement between the cell’s resolution-8 percentile and its resolution-7 parent percentile within the metro. 1.0 = perfect agreement, 0.0 = the two supports disagree completely. |
| consensus_computed_at | string \| null | ISO-8601 timestamp of the most recent classification run for this cell. |
Class definitions
| Class | When it fires | How to read it |
|---|---|---|
| stable_core | support_stability ≥ 0.80 and the cell sits above the bottom quartile of its metro. | The score is robust to grid choice. Treat as a high-confidence ranking. |
| ambiguity_shell | 0.20 ≤ support_stability < 0.80. | The cell ranks materially differently when re-aggregated to a coarser grid. Use the score directionally and pair it with neighbors before acting. |
| stable_non_signal | The cell is consistently in the bottom quartile of its metro across both supports. | Genuinely quiet. The low score is not a grid artifact — it is the same answer at every resolution. |
A cell returns consensus_class: null until the classification job has populated its row. Treat null the same way you’d treat confidence: null — render no robustness badge rather than guessing.
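The class definitions can be read as a small decision function. The thresholds (0.80, 0.20, bottom quartile) come from the table above. Several details are assumptions: percentiles are expressed as fractions in [0, 1], the stable_non_signal check takes precedence over the other two, and cells matching no rule return None. classify_consensus is a hypothetical name, not the Locus classification job.

```python
BOTTOM_QUARTILE = 0.25


def classify_consensus(support_stability, pct_r8, pct_r7):
    """Assign a consensus class from stability and the two metro percentiles."""
    if support_stability is None:
        return None  # not yet classified
    # Bottom quartile at both resolutions: genuinely quiet (checked first, assumed).
    if pct_r8 < BOTTOM_QUARTILE and pct_r7 < BOTTOM_QUARTILE:
        return "stable_non_signal"
    if support_stability >= 0.80 and pct_r8 >= BOTTOM_QUARTILE:
        return "stable_core"
    if 0.20 <= support_stability < 0.80:
        return "ambiguity_shell"
    return None  # no rule in the table covers this case
```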
Example response
When to use it
- Ranking and shortlisting. Filter to consensus_class = stable_core when you need the most defensible top-N list — those cells survive a grid change.
- Risk-flagging the long tail. Cells in ambiguity_shell are the cells most worth a human review before commitment. They typically sit on tier boundaries where small grid choices flip rankings.
- Suppressing false-positive low scores. A stable_non_signal cell is genuinely low across grids — a stable_non_signal paired with a composite_score of 35 means the cell is reliably quiet, not under-sampled.
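The first two patterns can be sketched as simple filters over cell-detail records. The consensus_class and composite_score field names come from this page; the cell_id key, the list-of-dicts shape, and both function names are assumptions.

```python
def shortlist(cells, top_n):
    """Top-N stable_core cells by composite_score, highest first."""
    stable = [c for c in cells if c.get("consensus_class") == "stable_core"]
    ranked = sorted(stable, key=lambda c: c["composite_score"], reverse=True)
    return ranked[:top_n]


def needs_review(cells):
    """Cells in ambiguity_shell, which merit a human look before commitment."""
    return [c for c in cells if c.get("consensus_class") == "ambiguity_shell"]


# Hypothetical records; IDs and scores are made up for the example.
cells = [
    {"cell_id": "a", "consensus_class": "stable_core", "composite_score": 71},
    {"cell_id": "b", "consensus_class": "ambiguity_shell", "composite_score": 88},
    {"cell_id": "c", "consensus_class": "stable_core", "composite_score": 79},
    {"cell_id": "d", "consensus_class": None, "composite_score": 90},
]
top = shortlist(cells, 1)
review = needs_review(cells)
```

Note that cell "d" has the highest score but is excluded from the shortlist because it has not been classified yet.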
Data freshness
| Source | Refresh cadence | Lag |
|---|---|---|
| POI inventory (Google Places) | Weekly | <7 days |
| Building permits (city open data) | Per-city; mostly weekly | 7-30 days depending on city |
| Traffic patterns (state DOTs) | Monthly | 30-60 days |
| Transit ridership (FTA NTD) | Monthly | 60-90 days (FTA publishing lag) |
| USPS vacancy (HUD) | Quarterly | 90-120 days |
| Jobs (BLS QCEW) | Quarterly | 90 days (BLS publishing lag) |
| Demographics (Census ACS) | Annual | 12-18 months |
| Real estate listings | Daily (where available) | 1-3 days |
| Crime (UCR/local) | Monthly | 30-60 days |
The freshness metadata in score responses tells you the oldest source feeding a given response — useful for deciding whether to cache.
API stability commitments
- Cell IDs — frozen across major API versions
- Scoring profile names — frozen
- Composite + sub-score scales (0–100) — frozen
- Confidence semantics — frozen
- Sub-score names within a group — may change with 90-day notice
- Underlying data sources — may change without notice (we pick the best available)
- Weight tuning per profile — adjusted quarterly based on backtesting; minor adjustments not announced, major rebalances get a 30-day blog post
Weight changes come with a profile_version bump in the response. Pin to a specific profile_version if you need replicable scores across time.
What Locus doesn’t include
Locus is non-PII. We do not include:
- Individual person data (names, addresses, phone numbers)
- Mobile-device location traces (Locus uses aggregated patterns, not raw movements)
- Customer-specific data unless you explicitly upload it via /api/data/upload
suppressed: true flag.