Optimize findings

The Optimize page runs a recurring pass over the last 7 days of `request_logs` and produces concrete findings, each with an estimated monthly dollar impact and a one-click fix path.

Finding types

| Finding | What it detects | Typical fix |
| --- | --- | --- |
| Oversized system prompt | System prompts above a per-model size threshold that don’t measurably improve output quality. | Trim the prompt or cache the system block. |
| Duplicate requests | Identical or near-identical prompts repeated within a short window from the same key. | Add idempotency or a client-side cache. |
| Model mismatch | Tasks classified as `chat` running on frontier models, or `agentic` running on small models. | Re-route via a rule. |
| Underused budget | Keys with monthly budgets that haven’t been touched in 30+ days. | Lower the budget or revoke. |
| Stuck in retry loop | Sessions with retry rate > 50% on the same model. | Compare against a stronger model in `/models/compare`. |
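
As an illustration of how one of these checks could work, here is a minimal sketch of the "Stuck in retry loop" rule: group requests by session and model, then flag any pair whose retry rate exceeds 50%. The `RequestLog` record and its field names (`session_id`, `model`, `is_retry`) are assumptions made for this example, not the actual `request_logs` schema.

```python
from collections import defaultdict
from dataclasses import dataclass

RETRY_RATE_THRESHOLD = 0.5  # "retry rate > 50%" from the table above


@dataclass(frozen=True)
class RequestLog:
    # Hypothetical record shape; the real request_logs columns may differ.
    session_id: str
    model: str
    is_retry: bool


def find_retry_loops(logs):
    """Return {(session_id, model): retry_rate} for pairs above the threshold."""
    totals = defaultdict(int)
    retries = defaultdict(int)
    for log in logs:
        key = (log.session_id, log.model)
        totals[key] += 1
        if log.is_retry:
            retries[key] += 1
    # Strictly greater than the threshold, matching "> 50%".
    return {
        key: retries[key] / totals[key]
        for key in totals
        if retries[key] / totals[key] > RETRY_RATE_THRESHOLD
    }
```

A session with three retries out of four requests on one model would be flagged at a rate of 0.75, while a session at exactly 50% would not.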

How estimates are calculated

Each finding ships with a projected monthly $ savings, computed from:

- The volume of matching requests in the last 7 days.
- The cost delta between the current model and the suggested target (priced from the LiteLLM catalog).
- A 30-day projection assuming current traffic patterns hold.

Estimates are conservative: RouteShift floors negative savings at zero and ignores findings under $1/month, so the page only surfaces work worth doing.
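
The calculation described above can be sketched roughly as follows. This is an illustration, not RouteShift's implementation: the function name, its parameters, the use of an average per-request cost, and the linear 7-day to 30-day scaling are all assumptions made for the example.

```python
def projected_monthly_savings(
    requests_7d,
    current_cost_per_request,
    target_cost_per_request,
    min_monthly_usd=1.0,
):
    """Project monthly savings in USD, or return None if below the reporting floor.

    All names and the linear scaling are hypothetical; costs are average
    USD per request for the current and suggested models.
    """
    # Savings observed over the 7-day window.
    weekly_savings = requests_7d * (current_cost_per_request - target_cost_per_request)
    # Scale to 30 days, assuming current traffic patterns hold.
    monthly = weekly_savings * 30 / 7
    # Floor negative savings at zero.
    monthly = max(monthly, 0.0)
    # Findings under $1/month are not surfaced.
    if monthly < min_monthly_usd:
        return None
    return monthly
```

For example, 7,000 matching requests in the window at $0.010 each, with a suggested target at $0.004 each, gives $42 of savings over 7 days and a projection of about $180/month. A target that costs more than the current model is floored to zero and then dropped by the $1 threshold.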

Acting on a finding

Each card has an Apply fix button that drops you into the right surface:

- Oversized system prompt → links to the offending requests so you can see which prompt template needs trimming.
- Duplicate requests → opens the rule editor pre-filled to add a deduplication tag.
- Model mismatch → opens `/models/compare` with both models pre-selected, then opens a rule draft.
- Stuck in retry loop → same as model mismatch.

Findings auto-resolve when the underlying behavior disappears from the trailing 7-day window.
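
One way to picture the auto-resolve rule is as a check on whether any request matching the finding still falls inside the trailing window. The sketch below assumes each finding can list the timestamps of its matching requests; `is_resolved` and its arguments are hypothetical names for this example.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=7)  # trailing 7-day window


def is_resolved(matching_request_times, now):
    """Return True when no matching request falls inside the trailing window.

    `matching_request_times` is an iterable of timezone-aware datetimes for
    requests that triggered the finding (a hypothetical input shape).
    """
    cutoff = now - WINDOW
    return not any(ts >= cutoff for ts in matching_request_times)


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
old_only = [datetime(2024, 1, 1, tzinfo=timezone.utc)]
still_active = [datetime(2024, 1, 10, tzinfo=timezone.utc)]
```

Here `is_resolved(old_only, now)` is true because the last matching request is 14 days old, while `is_resolved(still_active, now)` is false.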
