The Permit Signals dataset contains ~2.1M building, demolition, zoning, and construction permits across U.S. jurisdictions. Each permit is enriched with LLM-classified scope extraction, parcel linkage, and developer entity resolution. Permits are modeled as cascading lifecycle events that link back to the Events Timeline.
Every record inherits the full APRS envelope (record_id, chunk_id, bitemporal fields, confidence_score, provenance) and carries the join keys documented below.
Dataset-specific fields
| Field | Type | Nullable | Description |
|---|---|---|---|
| event_id | UUID | no | Links to the corresponding event in Events Timeline. |
| source_ref | string | no | Jurisdiction-native permit number. |
| permit_type | enum | no | Normalized permit classification. See permit types. |
| event_type | string | no | Lifecycle stage (e.g. permit.filed, permit.approved). See lifecycle. |
| occurred_at | timestamptz | no | Timestamp of this lifecycle stage. |
| effective_from | timestamptz | yes | Start of permit validity period. |
| effective_to | timestamptz | yes | End of permit validity period. |
| issuing_agency | string | no | Name of the issuing government agency. |
| jurisdiction_slug | string | no | Civic jurisdiction identifier. |
| address | string | yes | Property address. |
| parcel_id | string | yes | Assessor parcel identifier. |
| h3_index | string | yes | H3 resolution-8 cell derived from the property location. |
| declared_value | numeric | yes | Dollar value declared on the permit application (USD). |
| scope_extract | JSON object | yes | LLM-extracted structured description of the work scope. See scope extraction. |
| applicant_entity_urn | string | yes | Entity resolution link to the applicant or developer. |
Permit types
Jurisdiction-native permit type values are normalized to a 10-value enum. The raw source value is preserved in metadata.source_permit_type.
| Type | Description |
|---|---|
| new_construction | New building construction |
| addition | Addition to an existing structure |
| alteration | Interior or exterior alteration |
| demolition | Full or partial demolition |
| change_of_use | Change of occupancy or use classification |
| zoning | Variance, rezoning, or special exception |
| signage | Sign permits |
| mechanical | HVAC, plumbing, or electrical work |
| fire | Fire alarm or suppression system |
| other | Uncategorized permit types |
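As a minimal sketch of working with the normalized enum, the snippet below validates permit_type against the ten values listed above and appends the raw source value when one is present. The record layout (a dict with a nested metadata dict), the helper name describe_permit_type, and the sample raw value "Pool/Spa" are assumptions made for illustration, not part of the dataset specification.

```python
from enum import Enum


class PermitType(str, Enum):
    """The ten normalized permit types from the table above."""
    NEW_CONSTRUCTION = "new_construction"
    ADDITION = "addition"
    ALTERATION = "alteration"
    DEMOLITION = "demolition"
    CHANGE_OF_USE = "change_of_use"
    ZONING = "zoning"
    SIGNAGE = "signage"
    MECHANICAL = "mechanical"
    FIRE = "fire"
    OTHER = "other"


def describe_permit_type(record):
    """Return the normalized type, plus the raw source value if available.

    Hypothetical helper: assumes the record is a dict whose raw value sits
    at record["metadata"]["source_permit_type"].
    """
    # Raises ValueError if permit_type is not one of the ten enum values.
    normalized = PermitType(record["permit_type"])
    raw = (record.get("metadata") or {}).get("source_permit_type")
    if raw:
        return f"{normalized.value} (source: {raw})"
    return normalized.value


# Illustrative record; "Pool/Spa" is a made-up jurisdiction-native value.
sample_record = {
    "permit_type": "other",
    "metadata": {"source_permit_type": "Pool/Spa"},
}
label = describe_permit_type(sample_record)
```

Keeping the raw value next to the normalized one is most useful for the other bucket, where the enum alone says little about the work involved.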
Permit lifecycle
Each permit progresses through a series of stages, modeled as events linked by metadata.related_event_ids:
permit.filed → permit.approved → permit.completed
↘ permit.denied
↘ permit.cancelled
↘ permit.revoked
↘ permit.expired
Query the full lifecycle chain by following metadata.related_event_ids in the Events Timeline.
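The chain traversal can be sketched in a few lines of Python. This is an illustration under assumptions, not the dataset's reference implementation: it assumes events are already loaded as dicts with event_id, event_type, occurred_at (ISO 8601 strings), and a nested metadata.related_event_ids list, and that a link may be recorded on only one side of a pair. The function name lifecycle_chain and the sample IDs are hypothetical.

```python
from collections import deque
from datetime import datetime


def lifecycle_chain(events, start_event_id):
    """Return all events linked to start_event_id, ordered by occurred_at."""
    by_id = {e["event_id"]: e for e in events}
    if start_event_id not in by_id:
        return []

    # Treat links as undirected, since a link may appear on one side only.
    neighbors = {event_id: set() for event_id in by_id}
    for e in events:
        related = (e.get("metadata") or {}).get("related_event_ids") or []
        for related_id in related:
            if related_id in by_id:
                neighbors[e["event_id"]].add(related_id)
                neighbors[related_id].add(e["event_id"])

    # Breadth-first search collects every event connected to the start.
    seen = {start_event_id}
    queue = deque([start_event_id])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return sorted(
        (by_id[i] for i in seen),
        key=lambda e: datetime.fromisoformat(e["occurred_at"]),
    )


# Made-up events: e1..e3 form one permit's chain, x9 belongs to another permit.
events = [
    {"event_id": "e2", "event_type": "permit.approved",
     "occurred_at": "2026-02-10T00:00:00+00:00",
     "metadata": {"related_event_ids": ["e1", "e3"]}},
    {"event_id": "e1", "event_type": "permit.filed",
     "occurred_at": "2026-01-05T00:00:00+00:00",
     "metadata": {"related_event_ids": []}},
    {"event_id": "e3", "event_type": "permit.completed",
     "occurred_at": "2026-03-20T00:00:00+00:00",
     "metadata": {"related_event_ids": ["e2"]}},
    {"event_id": "x9", "event_type": "permit.filed",
     "occurred_at": "2026-01-07T00:00:00+00:00",
     "metadata": {"related_event_ids": []}},
]
stages = [e["event_type"] for e in lifecycle_chain(events, "e3")]
```

Starting from any stage returns the same chain, so a query can enter at whichever event it happens to find first.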
Scope extraction
The scope_extract field contains a structured JSON object produced by an LLM that parses the free-text work description on the permit application. A typical output:
{
  "work_type": "interior_renovation",
  "floors_affected": [2, 3],
  "units_added": 4,
  "square_footage": 12500,
  "use_change": "office_to_residential",
  "confidence": 0.82
}
Each extraction reports its own confidence in the object's confidence key. A value below 0.7 suggests the LLM had difficulty parsing the source text; review the raw permit description before relying on the extracted fields.
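One way to apply the 0.7 threshold is sketched below. It assumes scope_extract arrives either as a parsed dict or as a JSON string, and it treats a null scope_extract or a missing confidence key as needing review; that last choice is an assumption of this sketch, not documented behavior. The name needs_manual_review is hypothetical.

```python
import json

LOW_CONFIDENCE_THRESHOLD = 0.7  # threshold stated in the text above


def needs_manual_review(scope_extract):
    """Return True when the extraction should be checked against raw text."""
    if scope_extract is None:
        # scope_extract is nullable; with nothing extracted, fall back to
        # the raw description (assumption of this sketch).
        return True
    if isinstance(scope_extract, str):
        scope_extract = json.loads(scope_extract)
    confidence = scope_extract.get("confidence")
    return confidence is None or confidence < LOW_CONFIDENCE_THRESHOLD


# The typical output shown above, passed as a JSON string.
typical = '{"work_type": "interior_renovation", "units_added": 4, "confidence": 0.82}'
# A made-up low-confidence extraction.
uncertain = {"work_type": "interior_renovation", "confidence": 0.55}

typical_flag = needs_manual_review(typical)
uncertain_flag = needs_manual_review(uncertain)
```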
Join keys
| Key | Presence | Notes |
|---|---|---|
| record_id | always | APRS URN |
| chunk_id | always | Deterministic from record_id |
| event_id | always | Links to Events Timeline |
| h3_index | often | Null for permits without geocodable addresses |
| jurisdiction_slug | always | Civic jurisdiction identifier |
| parcel_id | often | Assessor parcel identifier |
| applicant_entity_urn | sometimes | Entity resolution link to applicant or developer |
Example query
Find high-value new construction permits (declared value of at least $1,000,000) filed in a jurisdiction in the last 90 days:
SELECT
  source_ref,
  permit_type,
  event_type,
  occurred_at,
  declared_value,
  scope_extract,
  address
FROM read_parquet('permit-signals-2026-04.parquet')
WHERE jurisdiction_slug = 'philadelphia-pa'
  AND permit_type = 'new_construction'
  AND event_type = 'permit.filed'
  AND occurred_at >= now() - interval '90 days'
  AND declared_value >= 1000000
ORDER BY declared_value DESC;
Known limitations
- Jurisdiction-native permit types vary widely. The 10-value normalized enum may lose specificity; check metadata.source_permit_type for the original classification.
- scope_extract quality depends on the structure of the source permit text. Jurisdictions with free-form descriptions produce lower-confidence extractions.
- parcel_id formats vary by jurisdiction. Use jurisdiction_slug + parcel_id as the composite key when joining with external assessor data.
- Lifecycle chain coverage depends on jurisdictional reporting. Some jurisdictions only publish filed and approved stages.
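The composite-key advice for parcel_id can be illustrated with a small in-memory join. This is a sketch under assumptions: the external assessor rows are assumed to carry the same jurisdiction_slug and parcel_id columns, and the function name join_on_parcel, the austin-tx slug, and all sample values are made up for the example.

```python
def join_on_parcel(permits, assessor_rows):
    """Inner-join permits to assessor rows on (jurisdiction_slug, parcel_id)."""
    # Index assessor rows by the composite key, so that identical parcel_id
    # strings from different jurisdictions never collide.
    index = {
        (row["jurisdiction_slug"], row["parcel_id"]): row
        for row in assessor_rows
    }
    joined = []
    for permit in permits:
        if permit.get("parcel_id") is None:
            continue  # parcel_id is nullable, so such permits cannot be joined
        key = (permit["jurisdiction_slug"], permit["parcel_id"])
        match = index.get(key)
        if match is not None:
            joined.append({**permit, "assessor": match})
    return joined


# Made-up rows: the same parcel_id string appears in two jurisdictions.
assessor_rows = [
    {"jurisdiction_slug": "philadelphia-pa", "parcel_id": "123-456", "owner": "A"},
    {"jurisdiction_slug": "austin-tx", "parcel_id": "123-456", "owner": "B"},
]
permits = [
    {"source_ref": "P-1", "jurisdiction_slug": "philadelphia-pa", "parcel_id": "123-456"},
    {"source_ref": "P-2", "jurisdiction_slug": "philadelphia-pa", "parcel_id": None},
]
joined = join_on_parcel(permits, assessor_rows)
```

Joining on parcel_id alone would have matched permit P-1 to both assessor rows; the composite key keeps only the row from the permit's own jurisdiction.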