Top-level archive contents
Per-control structure
Each control gets a folder named by the control ID (e.g. CC6.1 for SOC 2, A.5.16 for ISO 27001). Inside:
status.json
evidence/<evidence-id>.json
Each evidence file is a single observation with full provenance:
The raw_response field is the auditor’s friend: it’s the actual API response, unmodified, stored with the timestamp at which Codex received it. Auditors trust this because it’s not editorialized; if a question comes up about a specific event, the auditor can match the source_url to the live admin console.
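As a rough illustration of how an auditor-side script might sanity-check these files, the sketch below walks one control folder's evidence/ directory and lists which provenance fields are missing from each file. Only raw_response and source_url come from the description above; the function name `check_evidence` and the `collected_at` timestamp field are assumptions made for this example, not part of the documented export format.

```python
import json
from pathlib import Path

# Field names mentioned in the text above.
REQUIRED_FIELDS = ("raw_response", "source_url")


def check_evidence(control_dir, timestamp_field="collected_at"):
    """Return {evidence file name: [missing field names]} for one control folder.

    `timestamp_field` is a hypothetical name for the collection timestamp;
    change it to whatever the real evidence files use.
    """
    expected = REQUIRED_FIELDS + (timestamp_field,)
    report = {}
    # Evidence lives under <control folder>/evidence/<evidence-id>.json
    for path in sorted(Path(control_dir, "evidence").glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        report[path.name] = [name for name in expected if name not in record]
    return report
```

An empty list for a file means every expected field was present; anything else is a prompt to look at that file more closely.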
exceptions.md
When a control has documented exceptions (e.g. “this user retained access for 48h post-termination because of a contract negotiation”), they’re recorded as Markdown:
Aggregate exports (exports/)
These are CSVs the auditor can open in Excel/Sheets to scan for patterns. Common asks:
| File | What’s in it |
|---|---|
| access-review-<period>.csv | One row per (user, app) — current role, last activity, last review date |
| terminations-<period>.csv | One row per termination — HRIS event, IdP deactivation, gap, exceptions |
| changes-<period>.csv | One row per merged PR — repo, author, reviewers, merge time, deployed-to-prod time |
| incidents-<period>.csv | One row per incident — severity, detected, acknowledged, resolved, post-mortem URL |
| vulnerabilities-<period>.csv | One row per finding — source, severity, discovered, triaged, resolved, SLA met (Y/N) |
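
The kind of pattern scan these CSVs are meant for can also be scripted. The sketch below reads a terminations export and yields the rows where IdP deactivation lagged the HRIS event by more than a threshold. The column names (`hris_event_at`, `idp_deactivated_at`), the ISO 8601 timestamp format, and the 24-hour default are all assumptions for illustration; check the header row of the real file before relying on any of them.

```python
import csv
from datetime import datetime, timedelta


def late_deactivations(csv_path, max_gap=timedelta(hours=24),
                       hris_col="hris_event_at", idp_col="idp_deactivated_at"):
    """Yield (row, gap) for terminations whose IdP deactivation lagged the HRIS event.

    Column names and the timestamp format are hypothetical. Timestamps are
    expected as ISO 8601 with an explicit offset, e.g. 2024-03-01T09:00:00+00:00.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            hris = datetime.fromisoformat(row[hris_col])
            idp = datetime.fromisoformat(row[idp_col])
            gap = idp - hris
            if gap > max_gap:
                yield row, gap
```

Rows flagged this way are candidates to cross-check against the exceptions column, since a long gap may already have a documented justification.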
Manifest
manifest.json is the machine-readable index. Useful when the auditor uses a tool that ingests structured evidence (some now do — common in SOC 2 firms with audit-tech investments):
Format choices auditors notice
A few decisions Codex makes that auditors call out positively:
- No PDF-only artifacts — every control has both human-readable status AND machine-readable JSON. Auditors can grep, filter, and aggregate.
- Every evidence item has a timestamp from Codex’s collection — not “we generated this in March.” If the auditor questions a specific date, Codex’s collection time is the timestamp of record.
- Source links are deep URLs into the source admin console — the auditor can open Codex’s evidence and Google Workspace side by side to verify.
- Exceptions are first-class — Codex doesn’t hide failed controls, it surfaces them with documented justification. Auditors prefer companies that admit gaps (and remediate) over companies that paper over them.
What’s NOT in the export
- Source documents — your PDFs, contracts, and BAA documents are linked from policies/ or controls/<id>/exceptions.md but not duplicated. Auditors get separate read-only access to those source systems.
- Personal data — user names + emails are included where they’re material to the control (e.g. who did what, when). Customer PII is never included unless it’s specifically the audit subject.
- Source code — repo metadata (author, reviewer, merge time) is included; source code itself is not.