The POI Intelligence dataset contains ~89K points of interest across U.S. metros sourced from Foursquare, OpenStreetMap, Google Places, and manual verification. Each POI is enriched with industry classification (NAICS and ISIC), walkability and transit scores, and a pioneer-flag indicator for recently opened businesses in growth areas.
Every record inherits the full APRS envelope (record_id, chunk_id, bitemporal fields, confidence_score, provenance) and carries the join keys documented below.
Dataset-specific fields
| Field | Type | Nullable | Description |
|---|---|---|---|
| poi_id | string | no | Primary POI identifier. |
| name | string | no | Business or location name. |
| category | string | no | Top-level category (e.g. restaurant, retail, office). |
| subcategory | string | yes | Refined category (e.g. fast_casual, coworking). |
| naics_code | string | yes | 6-digit NAICS industry code (2022 revision). |
| isic_code | string | yes | ISIC Rev. 4 international industry code. |
| lat | float | no | WGS84 latitude. |
| lng | float | no | WGS84 longitude. |
| h3_index | string | no | H3 resolution-8 cell. |
| address | string | yes | Full postal address. |
| phone | string | yes | Contact phone number. |
| website | URL | yes | Business website. |
| is_pioneer | boolean | no | true when the POI opened in the last 12 months in a growth context. See pioneer classification. |
| walk_score | integer [0,100] | yes | Walkability score for the location. |
| transit_score | integer [0,100] | yes | Transit accessibility score. |
| reviews_sample | JSON array | yes | Up to 5 representative reviews. |
| photo_count | integer | yes | Number of photos available from source feeds. |
| source_feed | enum | no | Originating source: foursquare, osm, google, manual. |
Pioneer classification
A POI is classified as a “pioneer” when it meets all three criteria:
- Opened within the last 12 months
- Located in a cell with a rising Urban Signal Grid composite score
- Nearby area shows positive net migration and recent construction permits
Pioneer POIs are early indicators of neighborhood transformation. Use the is_pioneer flag to identify emerging commercial corridors before they appear in traditional market reports.
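The three criteria above can be read as a single conjunction. The following is a minimal sketch of that logic, not the production classifier: the input field names (`opened_on`, `cell_score_rising`, `net_migration`, `recent_permit_count`) are hypothetical, and treating "12 months" as 365 days and "positive" as strictly greater than zero are assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PioneerInputs:
    # All field names here are hypothetical; the dataset only publishes is_pioneer.
    opened_on: date            # opening date of the POI
    cell_score_rising: bool    # Urban Signal Grid composite score is rising in the POI's cell
    net_migration: float       # net migration for the nearby area (positive = inflow)
    recent_permit_count: int   # recent construction permits in the nearby area


def is_pioneer(poi: PioneerInputs, as_of: date) -> bool:
    """Return True only when all three documented criteria hold."""
    # Assumption: "last 12 months" is approximated as 365 days before as_of.
    age = as_of - poi.opened_on
    opened_recently = timedelta(0) <= age <= timedelta(days=365)
    # Assumption: both migration and permits must be strictly positive.
    growth_context = poi.net_migration > 0 and poi.recent_permit_count > 0
    return opened_recently and poi.cell_score_rising and growth_context


example = PioneerInputs(
    opened_on=date(2026, 1, 15),
    cell_score_rising=True,
    net_migration=120.0,
    recent_permit_count=3,
)
print(is_pioneer(example, as_of=date(2026, 4, 1)))  # prints True
```

Because the criteria are combined with a logical AND, a POI that opened recently in a cell with a flat score is not a pioneer, however strong the migration and permit signals are.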
Industry classification
Each POI carries up to three parallel industry taxonomies:
| Taxonomy | Field | Coverage |
|---|---|---|
| Codex category | category / subcategory | 100% of records |
| NAICS | naics_code | ~85% of records |
| ISIC | isic_code | ~80% of records |
Use naics_code for cross-joins with OSHA Safety, LEHD Commuter Flows, and other government datasets that use NAICS classification.
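Since naics_code is populated for only about 85% of records, a join on it silently drops the rest. A minimal sketch of that inner join, assuming POI rows loaded as dictionaries; the sample rows, the `external` table, and its `external_value` column are illustrative placeholders, not real fields from OSHA Safety or LEHD.

```python
from collections import defaultdict

# Illustrative rows only; the values are made up for this sketch.
pois = [
    {"poi_id": "poi-1", "name": "Cafe A", "naics_code": "722515"},
    {"poi_id": "poi-2", "name": "Shop B", "naics_code": None},  # naics_code is nullable
    {"poi_id": "poi-3", "name": "Diner C", "naics_code": "722511"},
]

# Hypothetical NAICS-indexed table standing in for an external dataset.
external = [
    {"naics_code": "722511", "external_value": 4.2},
    {"naics_code": "722515", "external_value": 3.1},
]


def join_on_naics(poi_rows, external_rows):
    """Inner-join POIs to an external table on naics_code, skipping null codes."""
    by_code = defaultdict(list)
    for row in external_rows:
        by_code[row["naics_code"]].append(row)

    joined = []
    for poi in poi_rows:
        code = poi["naics_code"]
        if code is None:
            continue  # POIs without a NAICS code cannot be matched
        for row in by_code.get(code, []):
            joined.append({**poi, **row})
    return joined


for row in join_on_naics(pois, external):
    print(row["poi_id"], row["naics_code"], row["external_value"])
```

If the unmatched POIs matter for the analysis, category and subcategory cover 100% of records and can serve as a fallback grouping.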
Source feeds and licensing
| Source | License | Notes |
|---|---|---|
| OpenStreetMap | ODbL-1.0 | Available in all tiers |
| Foursquare | CC-BY-4.0 | Research and Commercial tiers |
| Google Places | Restricted | Commercial tier only |
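The table above implies a per-tier filter on source_feed. A minimal sketch of that check follows; the lowercase tier strings (`research`, `commercial`) are assumed spellings, and any tier name not listed for a source is treated as not licensed for it. The `manual` feed is not covered by the table, so the sketch raises rather than guessing.

```python
# Sentinel meaning "available in every tier".
ALL_TIERS = None

# Mapping from source_feed value to the tiers that may include it.
# Tier spellings are assumptions; the availability rules come from the table.
SOURCE_TIERS = {
    "osm": ALL_TIERS,
    "foursquare": {"research", "commercial"},
    "google": {"commercial"},
}


def source_available(source_feed: str, tier: str) -> bool:
    """Return True when records from source_feed may appear in the given tier."""
    if source_feed not in SOURCE_TIERS:
        # For example "manual", which the licensing table does not list.
        raise KeyError(f"no licensing rule documented for source {source_feed!r}")
    allowed = SOURCE_TIERS[source_feed]
    return allowed is ALL_TIERS or tier in allowed


print(source_available("osm", "research"))       # prints True
print(source_available("google", "research"))    # prints False
print(source_available("foursquare", "commercial"))  # prints True
```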
Multi-source identity resolution maps duplicate entries from different feeds to a single poi_id. Source-native identifiers are preserved in the metadata.identifier array.
Join keys
| Key | Presence | Notes |
|---|---|---|
| record_id | always | APRS URN |
| chunk_id | always | Deterministic from record_id |
| poi_id | always | Primary POI identifier |
| h3_index | always | H3 resolution-8 spatial key |
| naics_code | often | Join with OSHA Safety, LEHD, and other NAICS-indexed datasets |
| entity_urn | sometimes | Entity resolution link (null until resolution pipeline runs) |
Example query
Find pioneer restaurants in a target metro area:
    SELECT
        poi_id,
        name,
        category,
        subcategory,
        naics_code,
        walk_score,
        transit_score,
        is_pioneer
    FROM read_parquet('poi-intelligence-2026-04.parquet')
    -- Restrict to cells whose H3 index starts with this prefix (stand-in for the target metro).
    WHERE h3_index LIKE '8828308%'
      AND is_pioneer = true
      AND category = 'restaurant'
    ORDER BY walk_score DESC;
Known limitations
- reviews_sample contains at most 5 representative reviews per POI. Full review text is not redistributable.
- walk_score and transit_score are null for POIs in areas without sufficient data coverage.
- Multi-source resolution means a single real-world business may have been merged from 2–3 source entries. Check metadata.identifier for source-native IDs.
- Pioneer classification depends on Urban Signal Grid scoring freshness: newly scored cells may take up to one refresh cycle to propagate.
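Because walk_score can be null, ranking by it needs an explicit rule for missing values; where nulls land in a sort otherwise depends on the query engine. A minimal sketch that ranks scored POIs from highest to lowest and places unscored ones last, using made-up sample rows:

```python
# Illustrative rows only; the values are made up for this sketch.
rows = [
    {"poi_id": "a", "walk_score": 72},
    {"poi_id": "b", "walk_score": None},  # no coverage for this location
    {"poi_id": "c", "walk_score": 91},
]


def sort_by_walk_score(pois):
    """Sort by walk_score descending, keeping records with a null score at the end."""
    return sorted(
        pois,
        # First key element pushes nulls last; second orders the rest high-to-low.
        key=lambda p: (p["walk_score"] is None, -(p["walk_score"] or 0)),
    )


print([p["poi_id"] for p in sort_by_walk_score(rows)])  # prints ['c', 'a', 'b']
```

The same pattern applies to transit_score, which is null under the same conditions.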