Choose your tier
| | Research | Commercial | Enterprise |
|---|---|---|---|
| Access | Free on Hugging Face | Signed R2 download URL | Signed R2 download URL |
| Records | 100K-record stratified sample per dataset | Full dataset | Full dataset |
| Formats | Parquet | Parquet, CSV, JSON Lines, GeoParquet | All formats + Markdown-KV |
| Snapshots | Latest only | Monthly immutable snapshots | Monthly immutable snapshots |
| Entity graph | — | — | Full graph export (details) |
| License | CC-BY-4.0 (attribution required) | Commercial license, per dataset | Commercial license, all datasets |
| Price | Free | $299/dataset/month | Contact sales |
Download a dataset
Research tier — Hugging Face
Browse the Axiom AI organization on Hugging Face and download any dataset directly. Each dataset includes a README with schema documentation and a 100K-record stratified sample in Parquet format.
Commercial or Enterprise tier
Purchase a license at axiomcodex.io. After checkout, your license key and signed download URL are emailed to the address used at checkout — delivery normally lands within a minute. The URL points to a monthly Parquet snapshot on Cloudflare R2.
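As a rough sketch of the download step, the signed URL can be fetched with Python's standard library alone. The function name `download_snapshot` and the destination path are illustrative choices, not part of any Codex tooling; paste in the URL from your own email.

```python
import shutil
import urllib.request
from pathlib import Path


def download_snapshot(signed_url: str, dest: str) -> Path:
    """Stream the file behind `signed_url` to `dest` and return the local path.

    `signed_url` is the URL from your license email; `dest` is any local path.
    Both names are illustrative, not part of a Codex API.
    """
    target = Path(dest)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so a large snapshot never has to fit in memory.
    with urllib.request.urlopen(signed_url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling `download_snapshot(url, "data/snapshot.parquet")` writes the file and returns its path, which you can pass to any Parquet reader.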
If the email doesn’t arrive, your license key is also stored on the subscription itself in Stripe. Use the Manage subscription link in the original receipt to open the Stripe customer portal and surface the key, or contact support@axiomancer.io and we’ll resend.
Run your first query
Load any Codex Parquet file into DuckDB, Pandas, Spark, or your preferred tool. Every dataset uses the same APRS envelope, so once you learn one, you know them all.
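A minimal Pandas sketch of a first look at a snapshot might read as follows. It assumes `pandas` plus a Parquet engine such as `pyarrow` are installed. The helper names `load_snapshot` and `profile` are hypothetical, and the code assumes no particular column names, since the APRS field list lives in the normalization standard.

```python
import pandas as pd


def load_snapshot(path: str) -> pd.DataFrame:
    """Read one Parquet file into a DataFrame (needs a Parquet engine such as pyarrow)."""
    return pd.read_parquet(path)


def profile(df: pd.DataFrame) -> dict:
    """Return the row count, column names, and per-column null counts."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "nulls": df.isna().sum().to_dict(),
    }
```

For example, `profile(load_snapshot("snapshot.parquet"))` shows which fields a dataset carries before you write a real query. The file name here is a placeholder for whichever file you downloaded.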
Join two datasets
Every Codex dataset shares a common set of join keys. Join Civic Intelligence to Urban Signal Grid via h3_index without any wrangling.
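One way to sketch that join in Pandas is below. Only the `h3_index` key comes from the text above; the function name `join_on_h3` and the `_civic` and `_urban` suffixes are assumptions made for illustration.

```python
import pandas as pd


def join_on_h3(civic: pd.DataFrame, urban: pd.DataFrame) -> pd.DataFrame:
    """Inner-join two Codex frames on the shared h3_index key.

    Columns that exist in both frames get a suffix so neither is overwritten.
    """
    return civic.merge(
        urban,
        on="h3_index",
        how="inner",
        suffixes=("_civic", "_urban"),  # illustrative suffixes, not a Codex convention
    )
```

If `h3_index` repeats on both sides, an inner join returns every pairing, so row counts can grow. Aggregating one side per cell first keeps the result at one row per record.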
Use the LLM-ready surface
Every dataset ships an llm_text column in Markdown-KV format, optimized for RAG pipelines and LLM reasoning. Feed it directly into your retrieval system or prompt.
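As a hedged illustration of the prompt path, the sketch below treats each llm_text value as an opaque string, picks the records that share the most words with a question, and pastes them into a prompt. The word-overlap ranking is a deliberately naive stand-in for a real retriever, and `top_records` and `build_prompt` are hypothetical helper names.

```python
import re
from typing import Iterable, List


def _tokens(text: str) -> set:
    """Lowercased word tokens, punctuation dropped."""
    return set(re.findall(r"\w+", text.lower()))


def top_records(question: str, records: Iterable[str], k: int = 3) -> List[str]:
    """Return up to k records sharing at least one word with the question."""
    terms = _tokens(question)
    scored = [(len(terms & _tokens(text)), text) for text in records]
    # Python's sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:k] if score > 0]


def build_prompt(question: str, records: Iterable[str], k: int = 3) -> str:
    """Assemble a context-plus-question prompt from the best-matching records."""
    context = "\n\n".join(top_records(question, records, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

With a DataFrame loaded, `build_prompt("your question", df["llm_text"].tolist())` yields a string you can send to any model. In a production pipeline you would swap the overlap ranking for vector search.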
_chunks Parquet file for each dataset, ready for vector indexing.
Next steps
Data catalog
Browse all eight datasets with record counts, formats, and tags.
Normalization standard
The APRS contract every record satisfies — field definitions, versioning, and conformance.
Join keys
The registry of shared keys that make cross-dataset joins work.
Bitemporal fields
Per-dataset reference for every temporal field, so you always know which clock a timestamp is on.