Fallback routing & watching your spend
Try cheap first, escalate only when needed — and watch the meter
Once switching models is one parameter, a powerful pattern follows: route a request to a fast, cheap (or free) model first, and only fall back to an expensive flagship when the cheap model isn't good enough. OpenRouter also gives you one unified usage dashboard across every provider, so you can see cost per model in one place rather than logging into each provider separately.
- 1 Open your OpenRouter usage dashboard on openrouter.ai and note your current credits and recent spend.
- 2 Send a batch of several requests on a
:freemodel (e.g.google/gemma-4-31b-it:free) — for instance the same question asked a few different ways. - 3 Send one harder request on the paid model
anthropic/claude-fable-5. - 4 Return to the usage dashboard and find the new entries. Confirm the free calls show $0 and the paid call shows a small charge at the provider's per-token rate.
- 5 Reason about a fallback rule for your own work: which requests are fine on a free model, and which justify escalating to the paid one?
You can see, in one dashboard, that your `:free` calls cost $0 and your paid call cost a small amount — and you have stated a rule for when a request should escalate from cheap to flagship.
Routing, latest-aliases, and one place to watch cost
Fallback routing and the usage dashboard are two halves of cost control: one decides which model handles a request, the other tells you what it cost. Understand the idea before you bake it into a pipeline.
Fallback routing is the pattern of trying a cheap model first and only falling back to a more capable, more expensive model when needed — this keeps most traffic on free or low-cost models and reserves the flagship for the hard cases. OpenRouter also offers latest-alias model ids that auto-resolve to a provider's newest flagship, so a routing rule can stay current without you editing it on every release. And because OpenRouter pre-pays providers from your credits and bills at pass-through rates, the unified usage dashboard shows your true cost per model across every provider in one view.
- ?In your own workflow, what signal would tell you a cheap model's answer is 'not good enough' and the request should fall back to a flagship?
- ?Why is a latest-alias id useful for a long-running pipeline that you don't want to keep editing?
- ?The dashboard shows cost per model. Which one model would you expect to dominate your spend, and how could fallback routing reduce that?
Design a cheap-first routing rule with a cost budget
Bring it together: a written routing policy plus a spend check turns OpenRouter from a single endpoint into a cost-aware backend for your work.
Define and document a cheap-first fallback policy for one of your real workflows, then verify the cost in the usage dashboard.
- 1 Pick a workflow (yours from lesson 01 or another). Write down the rule: which free model handles requests by default, and the condition under which a request escalates to a named paid model.
- 2 Run a representative set of requests through it — most on the free model, a few escalated to the paid one.
- 3 Open the usage dashboard and record the total cost of the run.
- 4 Write two or three sentences: your routing rule, the run's cost, and whether the free-first policy kept spend where you wanted it.
A written cheap-first routing policy (default free model + escalation condition + named paid fallback) and a dashboard-confirmed cost figure for one test run.