RouteShift logs every request — input tokens, output tokens, cost, latency, resolved upstream model, and an inferred activity category — into request_logs. From there, downstream tables stitch sessions, score turn signals, and roll up per-session metrics for the dashboard.
What gets logged
Every proxied request writes one row containing:
- Virtual key ID and provider key ID.
- Requested model + resolved upstream model (after aliases and routing).
- Input tokens, output tokens, total cost (microcents).
- Wall-clock latency, time-to-first-token, and upstream HTTP status.
- Inferred activity category.
- Stitched session ID and turn signals.
- Cache-hit flag (semantic cache hits skip token counting and write []/false for turn signals).
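Taken together, the fields above suggest a row shape like the following. This is a sketch only; field names beyond those listed in the docs are illustrative assumptions, not RouteShift's actual schema.

```typescript
type Activity = "chat" | "agentic" | "code" | "research" | "embedding" | "other";

// Hypothetical shape of a request_logs row.
interface RequestLogRow {
  virtualKeyId: string;
  providerKeyId: string;
  requestedModel: string;   // model the client asked for
  resolvedModel: string;    // upstream model after aliases and routing
  inputTokens: number;
  outputTokens: number;
  costMicrocents: number;   // total cost, in microcents
  latencyMs: number;        // wall-clock latency
  ttftMs: number | null;    // time-to-first-token
  upstreamStatus: number;   // upstream HTTP status
  activity: Activity;
  sessionId: string;        // stitched session ID
  cacheHit: boolean;
  editedPaths: string[];    // turn signal ([] on cache hits)
  hadBash: boolean;         // turn signal (false on cache hits)
}

// A semantic cache hit skips token counting and writes empty turn signals.
const cacheHitRow: RequestLogRow = {
  virtualKeyId: "vk_1",
  providerKeyId: "pk_1",
  requestedModel: "fast-chat",
  resolvedModel: "fast-chat-v2",
  inputTokens: 0,
  outputTokens: 0,
  costMicrocents: 0,
  latencyMs: 12,
  ttftMs: null,
  upstreamStatus: 200,
  activity: "chat",
  sessionId: "s_abc",
  cacheHit: true,
  editedPaths: [],
  hadBash: false,
};
```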
Activity categorization
RouteShift classifies each request into one of these categories based on the prompt shape and model behavior:
- chat — interactive single-turn or short multi-turn dialog.
- agentic — long tool-use chains, multi-turn with structured output.
- code — code generation, refactoring, or completion.
- research — long-context summarization or analysis.
- embedding — embedding model calls.
- other — anything that doesn’t match the above.
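The classifier itself isn't shown in these docs; a minimal heuristic along the same lines might look like the sketch below. The input fields and thresholds are assumptions, and the real classifier also weighs model behavior, which this version ignores.

```typescript
type Activity = "chat" | "agentic" | "code" | "research" | "embedding" | "other";

// Illustrative heuristic only — not RouteShift's actual classifier.
function classifyActivity(req: {
  model: string;
  promptTokens: number;
  turnCount: number;
  toolCalls: number;
  looksLikeCode: boolean;
}): Activity {
  // Embedding models are identified by name.
  if (req.model.includes("embed")) return "embedding";
  // Long tool-use chains or tool-heavy multi-turn conversations.
  if (req.toolCalls >= 3 || (req.turnCount > 4 && req.toolCalls > 0)) return "agentic";
  if (req.looksLikeCode) return "code";
  // Long-context prompts are treated as research/summarization.
  if (req.promptTokens > 20_000) return "research";
  if (req.turnCount <= 4) return "chat";
  return "other";
}
```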
Session stitching
A session is a sequence of requests from the same virtual key that share contextual signals (continuation of a conversation, same session_id metadata, contiguous timestamps, similar prompts). RouteShift derives a session ID via deriveSessionId and writes it into request_logs.session_id. The aggregator (session-aggregator.ts) rolls those rows into session_metrics every five minutes, behind a pg_try_advisory_xact_lock so concurrent runs can’t double-count.
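The stitching logic could be sketched roughly as follows. This is an assumption-laden illustration, not the real deriveSessionId: the gap threshold is invented, and prompt-similarity stitching is omitted entirely.

```typescript
import { createHash } from "node:crypto";

const GAP_MS = 30 * 60 * 1000; // assumed idle gap that starts a new session

// Sketch only: explicit session_id metadata wins, then contiguity with the
// previous request from the same virtual key, then a fresh ID.
function deriveSessionId(
  req: { virtualKeyId: string; sessionMeta?: string; ts: number },
  prev?: { sessionId: string; virtualKeyId: string; ts: number },
): string {
  const hash = (s: string) =>
    createHash("sha256").update(s).digest("hex").slice(0, 16);

  // 1. Explicit session_id metadata always wins.
  if (req.sessionMeta) return hash(`${req.virtualKeyId}:${req.sessionMeta}`);

  // 2. Continue the previous session if same key and contiguous timestamps.
  if (prev && prev.virtualKeyId === req.virtualKeyId && req.ts - prev.ts < GAP_MS) {
    return prev.sessionId;
  }

  // 3. Otherwise start a fresh session keyed on this request's timestamp.
  return hash(`${req.virtualKeyId}:${req.ts}`);
}
```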
session_metrics powers:
- Overview KPI — total sessions, mean tokens per session, mean cost per session.
- Analytics by-model table — per-model session counts, one-shot rate, retry rate.
- Activity drill-in — click any row in the activity feed to see all requests in that session.
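The shape of that rollup can be illustrated with a minimal sketch; the actual session_metrics columns and the aggregator's SQL may differ.

```typescript
// Trimmed-down row and metric types for illustration only.
interface LogRow { sessionId: string; tokens: number; costMicrocents: number }
interface SessionMetric { sessionId: string; requests: number; tokens: number; costMicrocents: number }

// Group request_logs rows by session and sum their counters.
function rollUp(rows: LogRow[]): SessionMetric[] {
  const byId = new Map<string, SessionMetric>();
  for (const r of rows) {
    const m = byId.get(r.sessionId) ??
      { sessionId: r.sessionId, requests: 0, tokens: 0, costMicrocents: 0 };
    m.requests += 1;
    m.tokens += r.tokens;
    m.costMicrocents += r.costMicrocents;
    byId.set(r.sessionId, m);
  }
  return [...byId.values()];
}
```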
Turn signals
Each request can be tagged with structured turn signals that describe what happened:
- edited_paths text[] — file paths the model edited (when integrated with a coding agent).
- had_bash boolean — whether the turn ran a shell command.
One-shot rate and retry rate
Two derived metrics capture how efficient a session was:
- One-shot rate — fraction of sessions where the user got a usable response on the first turn. High one-shot rate means the model is sized correctly for the task.
- Retry rate — fraction of turns where the user re-prompted (manual retry) within a short window. High retry rate is a leading indicator of model mismatch.
Both metrics are exposed at GET /api/usage/one-shot and broken down by model on the analytics page.
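The two rates could be computed along these lines. This is a sketch under stated assumptions: one-shot rate is approximated as the share of single-turn sessions, and the "short window" for retries is set arbitrarily to 60 seconds.

```typescript
interface Turn { sessionId: string; ts: number }

const RETRY_WINDOW_MS = 60_000; // assumed retry window

// One-shot rate approximated as: sessions with exactly one turn / all sessions.
function oneShotRate(turns: Turn[]): number {
  const counts = new Map<string, number>();
  for (const t of turns) counts.set(t.sessionId, (counts.get(t.sessionId) ?? 0) + 1);
  if (counts.size === 0) return 0;
  let oneShot = 0;
  for (const n of counts.values()) if (n === 1) oneShot++;
  return oneShot / counts.size;
}

// Retry rate: turns that follow another turn in the same session within the
// retry window, divided by total turns.
function retryRate(turns: Turn[]): number {
  if (turns.length === 0) return 0;
  const bySession = new Map<string, number[]>();
  for (const t of turns) {
    const arr = bySession.get(t.sessionId) ?? [];
    arr.push(t.ts);
    bySession.set(t.sessionId, arr);
  }
  let retries = 0;
  for (const ts of bySession.values()) {
    ts.sort((a, b) => a - b);
    for (let i = 1; i < ts.length; i++) {
      if (ts[i] - ts[i - 1] < RETRY_WINDOW_MS) retries++;
    }
  }
  return retries / turns.length;
}
```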
Model comparison
/models/compare plots two models side by side on the same traffic — same activity categories, same prompt sizes — and shows the cost / latency / one-shot-rate delta. Use it to A/B a candidate downgrade before flipping a routing rule.
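The delta that /models/compare shows can be sketched as a simple percentage comparison; the field names and the choice of percentage points for one-shot rate are assumptions, not the page's actual output format.

```typescript
interface ModelStats { costMicrocents: number; latencyMs: number; oneShotRate: number }

// Illustrative delta computation: candidate vs. baseline on the same traffic.
function compareModels(candidate: ModelStats, baseline: ModelStats) {
  const pct = (a: number, b: number) => (b === 0 ? 0 : ((a - b) / b) * 100);
  return {
    costDeltaPct: pct(candidate.costMicrocents, baseline.costMicrocents),
    latencyDeltaPct: pct(candidate.latencyMs, baseline.latencyMs),
    // Rate deltas read better as percentage points than relative percent.
    oneShotDeltaPts: (candidate.oneShotRate - baseline.oneShotRate) * 100,
  };
}
```

A candidate downgrade that halves cost and latency but drops the one-shot rate sharply is the kind of trade-off this view is meant to surface before a routing rule is flipped.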