A virtual key is an sk-proxy-… token your application uses to call RouteShift. It maps to one or more provider keys (the actual upstream credentials) but never exposes them. Every key has its own controls (model allowlist, rate limits, budget, expiry, and metadata), so you can hand out scoped keys per environment, customer, agent, or experiment without minting upstream credentials.

Per-key controls

| Control | Behavior |
| --- | --- |
| Allowed models | Whitelist of model IDs (or * for all). Requests outside the list are rejected with 403. |
| Expires at | Calendar date after which the key is auto-revoked. Existing requests in flight finish; new ones return 401. |
| RPM | Requests-per-minute cap, enforced in an isolated bucket per key. Overrides any team-level RPM. |
| TPM | Tokens-per-minute cap, summed across input + output tokens. Bursty long-context calls are throttled before they reach upstream. |
| Monthly budget | Hard USD cap (microcent-precision). Soft-alert threshold (default 80%) fires a notification; reaching 100% blocks new requests until the next monthly reset. |
| Metadata | Arbitrary key=value pairs. Used by /billing to break down spend by environment, customer, or any other dimension. |

All values can be set when minting the key (Keys → Create key) or edited later from Keys → … → Edit. Edits apply to new requests immediately.
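
To make the interaction of these controls concrete, here is a minimal sketch of how a per-key admission check could be modeled. It is not RouteShift's implementation: the `VirtualKey` class, `check_request`, `soft_alert_due`, the field names, and the order of the checks are all assumptions made for illustration. The source does not say which status a budget block returns, so the sketch reports a plain reason string for that case.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# 1 USD = 100 cents, and 1 cent = 1,000,000 microcents.
MICROCENTS_PER_USD = 100 * 1_000_000


@dataclass
class VirtualKey:
    """Hypothetical in-memory model of the per-key controls (not a real API)."""
    allowed_models: frozenset = frozenset({"*"})      # "*" means all models
    expires_at: Optional[datetime] = None             # None means no expiry
    monthly_budget_microcents: Optional[int] = None   # None means no cap
    spent_microcents: int = 0
    soft_alert_percent: int = 80                      # default from the table
    metadata: dict = field(default_factory=dict)


def check_request(key: VirtualKey, model: str, now: datetime) -> tuple[bool, str]:
    """Return (allowed, reason) for a new request. Check order is an assumption."""
    if key.expires_at is not None and now >= key.expires_at:
        return False, "401 key expired"
    if "*" not in key.allowed_models and model not in key.allowed_models:
        return False, "403 model not allowed"
    budget = key.monthly_budget_microcents
    if budget is not None and key.spent_microcents >= budget:
        # Status code for this case is not documented in the source.
        return False, "budget exhausted"
    return True, "ok"


def soft_alert_due(key: VirtualKey) -> bool:
    """True once spend reaches the soft-alert threshold (integer math only)."""
    budget = key.monthly_budget_microcents
    if budget is None:
        return False
    return key.spent_microcents * 100 >= budget * key.soft_alert_percent
```

Keeping the budget as an integer count of microcents avoids floating-point drift when many small request costs are summed, which is one plausible reason to store spend at that precision.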

Key lifecycle

Every state change is recorded in Keys → … → Audit log: created, edited, rotated, revoked, expired. Audit events include the actor (user ID), timestamp, and a diff of the changed fields. The log is append-only and exposed via the admin API for backup or SIEM forwarding.
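
As an illustration of what an audit event with a field-level diff might look like, the sketch below builds one for an edit and stores it in an append-only list. The names `AuditEvent`, `AuditLog`, `diff_fields`, and `record_edit`, along with the `{"old": ..., "new": ...}` diff shape, are hypothetical; the actual admin API payload is not described in this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def diff_fields(before: dict, after: dict) -> dict:
    """Return only the fields whose values differ, with old and new values."""
    changed = {}
    for name in sorted(before.keys() | after.keys()):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = {"old": old, "new": new}
    return changed


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical event shape: action, actor, timestamp, and diff."""
    action: str          # created, edited, rotated, revoked, or expired
    actor: str           # user ID of whoever made the change
    timestamp: datetime
    diff: dict


class AuditLog:
    """Append-only: events can be added and read, never changed or removed."""

    def __init__(self) -> None:
        self._events: list = []

    def append(self, event: AuditEvent) -> None:
        self._events.append(event)

    def events(self) -> tuple:
        # Return an immutable snapshot so callers cannot mutate the log.
        return tuple(self._events)


def record_edit(log: AuditLog, actor: str, before: dict, after: dict,
                now: datetime) -> AuditEvent:
    """Record an 'edited' event holding the diff between two key configs."""
    event = AuditEvent("edited", actor, now, diff_fields(before, after))
    log.append(event)
    return event
```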

Key rotation with a grace period

When a key is compromised — or you just want to roll credentials on a schedule — use Keys → … → Rotate. RouteShift mints a new sk-proxy-… token, marks the old one as rotated, and keeps it valid for a configurable grace window (default 24 hours). During the grace window, both tokens accept traffic, so you can deploy the new value without downtime. After the window closes, the old token returns 401.
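
The grace-window behavior described above can be sketched roughly as follows. This is an illustrative model only: the `Token` class, `rotate` function, and the randomly generated token suffix are assumptions, and treating the exact moment the window closes as already expired is a guess about boundary behavior.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_GRACE = timedelta(hours=24)  # default grace window from the text


@dataclass
class Token:
    """Hypothetical record for one sk-proxy-... token."""
    value: str
    rotated_at: Optional[datetime] = None  # None means never rotated
    grace: timedelta = DEFAULT_GRACE

    def accepts(self, now: datetime) -> bool:
        """A rotated token stays valid until its grace window closes."""
        if self.rotated_at is None:
            return True
        return now < self.rotated_at + self.grace


def rotate(old: Token, now: datetime, grace: timedelta = DEFAULT_GRACE) -> Token:
    """Mark the old token as rotated and mint a replacement."""
    old.rotated_at = now
    old.grace = grace
    # The real token format beyond the sk-proxy- prefix is not documented here.
    return Token(value="sk-proxy-" + secrets.token_urlsafe(24))
```

With this model, a deployment has the whole grace window to swap the old value for the new one, since both tokens return `True` from `accepts` until the window closes.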

Cost attribution

Per-key spend is computed from the LiteLLM pricing catalog and rolled up nightly. The Billing → Spend breakdown view lets you slice spend by:
  • Virtual key
  • Metadata tag (e.g. env=prod, customer=acme, agent=research)
  • Provider
  • Model
  • Activity category
Each row shows total cost, request count, mean tokens, and a 30-day sparkline.
Tag keys at creation time with whatever dimensions you bill on. RouteShift never indexes the metadata for routing; it is used purely for downstream analytics.
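
To show how metadata tags could drive a spend breakdown, here is a small sketch that groups per-request cost records by one tag. The record shape (`metadata`, `cost_microcents`, `tokens`), the `spend_by_tag` function, and the `(untagged)` bucket are assumptions for illustration; they do not reflect the actual billing schema or the nightly rollup job.

```python
from collections import defaultdict


def spend_by_tag(records: list, tag: str) -> dict:
    """Group request records by one metadata tag and total them per value."""
    totals = defaultdict(lambda: {"cost_microcents": 0, "requests": 0, "tokens": 0})
    for rec in records:
        # Keys minted without this tag land in a catch-all bucket.
        bucket = totals[rec["metadata"].get(tag, "(untagged)")]
        bucket["cost_microcents"] += rec["cost_microcents"]
        bucket["requests"] += 1
        bucket["tokens"] += rec["tokens"]
    return {
        value: {
            "cost_microcents": b["cost_microcents"],
            "requests": b["requests"],
            "mean_tokens": b["tokens"] / b["requests"],
        }
        for value, b in totals.items()
    }


# Example input: each record carries the metadata of the key that made the call.
records = [
    {"metadata": {"env": "prod", "customer": "acme"}, "cost_microcents": 300, "tokens": 100},
    {"metadata": {"env": "prod", "customer": "acme"}, "cost_microcents": 100, "tokens": 300},
    {"metadata": {"env": "dev"}, "cost_microcents": 50, "tokens": 10},
]
by_env = spend_by_tag(records, "env")
by_customer = spend_by_tag(records, "customer")
```

Because the grouping relies only on tags present on the key, a dimension that was never tagged can only be reported in the catch-all bucket in this sketch, which is why tagging at creation time matters.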