# I built a 200 line AI router in TypeScript. My monthly bill dropped 41%.
GDS K S · May 7, 2026

I track my own AI spend across three projects. In March, the line item that grew fastest was not Claude or GPT calls. It was my Cursor seat plus my Copilot seat plus the Anthropic API I was hitting from a personal CLI. Three subscriptions, three meters, and the same Opus tokens billed twice because Cursor was sending the same context to its own backend that I was already passing through to Anthropic directly.

The wrappers do not advertise this. The router code is not their product. The product is the convenience of not thinking about which model handles which prompt. You pay the orchestration tax in margin baked into the seat price.

I got tired of paying it. So I wrote the router. It is 200 lines of TypeScript. My April bill came in 41% under March on roughly the same volume of work.

## TL;DR

| Model | Input $/M tokens | Output $/M tokens | Best for |
|-------|------------------|-------------------|----------|
| Haiku 4.5 | 0.80 | 4.00 | Lookups, classification, typo fixes |
| Sonnet 4.6 | 3.00 | 15.00 | Default coding, refactors, code review |
| Opus 4.7 | 5.00 | 25.00 | Multi step planning, architecture |
| GPT-5 mini | 0.50 | 2.00 | Cheap classification, embeddings prep |

The 41% saving came from one thing: stopping Sonnet from handling tasks that Haiku could finish in a tenth of the cost. Most coding queries are lookups dressed up as questions. Route by intent, not by habit.

## 1. The orchestration tax is real

Every wrapper makes the same trade. They pick a model for you, they prepend a system prompt you cannot edit, and they hold a context window you cannot inspect. In return, you do not have to think.

The cost of not thinking shows up two ways:

1. The wrapper calls the most expensive model that fits its SLA, because that is what makes the demo look good
2. The wrapper bills you for context it sent on your behalf, including its own system prompt and tool definitions

I logged 30 days of Cursor usage against the Anthropic dashboard.
Cursor was sending an average of 8,400 input tokens per chat turn. My direct API calls for the same chats averaged 1,900. The 6,500 token delta is Cursor's frame, plus indexing context, plus its agent scaffolding. Useful, but not free.

When you build the router yourself, you choose what to send. That is the whole game.

## 2. The 200 line router

Here is the file. Drop it in a project, give it your API keys, and it picks a model per request based on rules you control.

```typescript
// router.ts
import Anthropic from "@anthropic-ai/sdk";
import OpenAI from "openai";

type Intent = "trivial" | "code" | "plan" | "embed";

interface RouteRule {
  match: (prompt: string) => boolean;
  intent: Intent;
}

interface ModelConfig {
  provider: "anthropic" | "openai";
  model: string;
  maxTokens: number;
}

// One model per intent. Model IDs must match the keys in PRICING below.
const ROUTES: Record<Intent, ModelConfig> = {
  trivial: { provider: "anthropic", model: "claude-haiku-4-5-20251001", maxTokens: 1024 },
  code: { provider: "anthropic", model: "claude-sonnet-4-6", maxTokens: 4096 },
  plan: { provider: "anthropic", model: "claude-opus-4-7", maxTokens: 8192 },
  embed: { provider: "openai", model: "gpt-5-mini", maxTokens: 512 },
};

// First match wins, so order matters.
const RULES: RouteRule[] = [
  // The explicit prefix goes first so a short "CLASSIFY: ...?" prompt is not
  // claimed by the question rule below.
  { intent: "embed", match: (p) => p.startsWith("CLASSIFY:") },
  { intent: "trivial", match: (p) => p.length < 200 && /\?$/.test(p.trim()) },
  { intent: "trivial", match: (p) => /^(what is|define|fix typo|rename)/i.test(p) },
  { intent: "plan", match: (p) => /(refactor|design|architect|migrate|plan)/i.test(p) },
  { intent: "code", match: (p) => /(```|function |class |const |let )/i.test(p) },
];

function pickIntent(prompt: string): Intent {
  for (const rule of RULES) {
    if (rule.match(prompt)) return rule.intent;
  }
  return "code"; // conservative default when nothing matches
}

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export interface RouteResult {
  text: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// USD per million tokens, keyed by model ID.
const PRICING: Record<string, { in: number; out: number }> = {
  "claude-haiku-4-5-20251001": { in: 0.8, out: 4 },
  "claude-sonnet-4-6": { in: 3, out: 15 },
  "claude-opus-4-7": { in: 5, out: 25 },
  "gpt-5-mini": { in: 0.5, out: 2 },
};

function priceCall(model: string, inTok: number, outTok: number): number {
  const p = PRICING[model];
  if (!p) return 0; // unknown model: report zero rather than guess
  return (inTok * p.in + outTok * p.out) / 1_000_000;
}

export async function route(prompt: string): Promise<RouteResult> {
  const intent = pickIntent(prompt);
  const cfg = ROUTES[intent];

  if (cfg.provider === "anthropic") {
    const r = await anthropic.messages.create({
      model: cfg.model,
      max_tokens: cfg.maxTokens,
      messages: [{ role: "user", content: prompt }],
    });
    // A response can mix block types; keep only the text blocks.
    const text = r.content
      .filter((b) => b.type === "text")
      .map((b) => (b as { text: string }).text)
      .join("");
    return {
      text,
      model: cfg.model,
      inputTokens: r.usage.input_tokens,
      outputTokens: r.usage.output_tokens,
      costUsd: priceCall(cfg.model, r.usage.input_tokens, r.usage.output_tokens),
    };
  }

  const r = await openai.chat.completions.create({
    model: cfg.model,
    // max_tokens is deprecated on this endpoint in favor of max_completion_tokens.
    max_completion_tokens: cfg.maxTokens,
    messages: [{ role: "user", content: prompt }],
  });
  const usage = r.usage ?? { prompt_tokens: 0, completion_tokens: 0 };
  return {
    text: r.choices[0]?.message?.content ?? "",
    model: cfg.model,
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    costUsd: priceCall(cfg.model, usage.prompt_tokens, usage.completion_tokens),
  };
}
```

That is it. Two providers, four intents, five rules, and a cost calculator. Use it like this:

```typescript
import { route } from "./router";

const out = await route("rename this function from getUser to fetchUser");
console.log(out.model, out.costUsd.toFixed(5));
// e.g. claude-haiku-4-5-20251001 0.00012
```

The rules are deliberately dumb. Length plus regex covers maybe 70% of routing decisions correctly.
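
One way to put a number like that on your own workload is to score a rule list against a small hand-labeled sample. The sketch below is a minimal version under stated assumptions: `pickWith`, `scoreRules`, the trimmed rule list, and the four sample prompts are all hypothetical and are not part of the router file.

```typescript
// Hypothetical harness: score a first-match-wins rule list against labeled prompts.
type Intent = "trivial" | "code" | "plan" | "embed";

interface RouteRule {
  intent: Intent;
  match: (prompt: string) => boolean;
}

interface Labeled {
  prompt: string;
  expected: Intent;
}

interface Miss extends Labeled {
  got: Intent;
}

// Trimmed rule list in the same shape as RULES (assumption: three rules are enough to illustrate).
const sampleRules: RouteRule[] = [
  { intent: "trivial", match: (p) => p.length < 200 && /\?$/.test(p.trim()) },
  { intent: "trivial", match: (p) => /^(what is|define|fix typo|rename)/i.test(p) },
  { intent: "plan", match: (p) => /(refactor|design|architect|migrate|plan)/i.test(p) },
];

// Same first-match-wins loop as pickIntent, with the rule list passed in.
function pickWith(rules: RouteRule[], prompt: string, fallback: Intent = "code"): Intent {
  for (const rule of rules) {
    if (rule.match(prompt)) return rule.intent;
  }
  return fallback;
}

// Returns the fraction routed as labeled, plus every miss for later review.
function scoreRules(rules: RouteRule[], samples: Labeled[]): { hitRate: number; misses: Miss[] } {
  const misses: Miss[] = [];
  for (const s of samples) {
    const got = pickWith(rules, s.prompt);
    if (got !== s.expected) misses.push({ ...s, got });
  }
  const hitRate = samples.length === 0 ? 1 : (samples.length - misses.length) / samples.length;
  return { hitRate, misses };
}

// Made-up prompts with hand-assigned labels.
const samples: Labeled[] = [
  { prompt: "what is a mutex?", expected: "trivial" },
  { prompt: "rename getUser to fetchUser", expected: "trivial" },
  { prompt: "design a permissions model for a multi-tenant app", expected: "plan" },
  { prompt: "how should we migrate the billing tables to the new schema?", expected: "plan" },
];

const report = scoreRules(sampleRules, samples);
console.log(report.hitRate, report.misses.map((m) => `${m.got} <- ${m.prompt}`));
```

In this sample the last prompt is the miss: it is short and ends in a question mark, so the first rule claims it before the plan rule runs. Rows like that are worth collecting before adding or reordering rules.
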
For the other 30%, override with a prefix:

```typescript
await route("[force:opus] design a permissions model for ...");
```

Add a one liner to `pickIntent` to read the prefix. I left it out to keep the example tight.

## 3. Routing rules that actually work

The naive approach is to send a tiny classifier call to a cheap model and have it pick the route. That sounds smart and costs more than it saves, because every request now eats two API calls. The cost of `pickIntent` must be zero.

Five regex rules cover most of my workload:

- Short and ends in a question mark: trivial
- Starts with "what is", "define", "fix typo", "rename": trivial
- Contains "refactor", "design", "architect", "migrate", "plan": plan
- Contains code fence or function keyword: code
- Starts with "CLASSIFY:" prefix: embed (cheap classifier)

Default to code. A wrong route from trivial to code costs maybe 4x more on that one request. A wrong route from code to opus costs about 1.7x (5.00 against 3.00 per million input tokens). Neither is a disaster. The bug to avoid is sending Haiku a multi step plan it cannot hold context for, which means default conservatively.

I also log every miss. After two weeks I had a small CSV of "this prompt routed to X but should have been Y". I added two regex rules and the miss rate dropped from 8% to under 2%.

## 4. The 41% number, broken down

March bill, no router:

| Source | Calls | Spend |
|--------|-------|-------|
| Cursor seat | n/a | $20 |
| Copilot seat | n/a | $10 |
| Anthropic direct | 4,200 | $87 |
| OpenAI direct | 800 | $14 |
| **Total** | | **$131** |

April bill, with router (cancelled Cursor, kept Copilot for IDE inline only):

| Source | Calls | Spend |
|--------|-------|-------|
| Cursor seat | n/a | $0 |
| Copilot seat | n/a | $10 |
| Anthropic via router | 5,100 | $54 |
| OpenAI via router | 1,400 | $13 |
| **Total** | | **$77** |

That is 41% lower on 30% more total calls. The router shifted 62% of calls onto Haiku, which took over workloads Sonnet had been handling.
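
The two headline percentages can be re-derived from the table totals. A small check, using only the figures in the March and April tables; the `Bill` type and the `pctChange` helper are illustrative names, not part of the router.

```typescript
// Totals copied from the March and April tables.
interface Bill {
  spendUsd: number;
  calls: number;
}

const march: Bill = { spendUsd: 20 + 10 + 87 + 14, calls: 4200 + 800 };
const april: Bill = { spendUsd: 0 + 10 + 54 + 13, calls: 5100 + 1400 };

// Percent change from `before` to `after`, rounded to a whole percent.
function pctChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

const spendChange = pctChange(march.spendUsd, april.spendUsd); // (77 - 131) / 131
const callChange = pctChange(march.calls, april.calls); // (6500 - 5000) / 5000

console.log(`spend ${spendChange}%, calls ${callChange}%`);
// prints "spend -41%, calls 30%"
```
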
Average cost per call dropped from $0.024 to $0.013.

The Cursor cancel did the headline saving. The router did the smaller, repeating, compounding saving. Both come from the same idea: the wrapper is hiding decisions you could make better yourself.

## 5. What this does not do

This is not an agent framework. It does not stream. It does not retry beyond whatever the SDK clients do by default. It does not cache. It does not handle rate limits. It does not do tool use. It does not know about your codebase.

Adding any of those takes work. Streaming is two changes. Caching with the Anthropic prompt cache means adding a `cache_control` marker to the content blocks you want reused. Custom retries with exponential backoff are about 20 lines. Tool use requires schema plumbing you would write anyway.

If you need all of that, use a real framework. If you want to stop paying the orchestration tax on 80% of your calls, the 200 lines above will do it. Add the rest as you actually hit each problem.

## Conclusion

Wrappers exist because routing AI calls is annoying. It is also the highest leverage thing you can own in your own code. The 200 lines above are not a moat. They are a Tuesday afternoon. The reason to write them is that you cannot improve a bill you cannot see.

What is your current ratio of cheap model to expensive model calls? If you do not know, that is the first thing to fix. Wire up cost logging before you wire up the router. The numbers will surprise you.

---

**GDS K S** · [thegdsks.com](https://thegdsks.com) · building [Glincker](https://glincker.com) · follow on X [@thegdsks](https://x.com/thegdsks)

*The orchestration tax is the part of the AI bill that does not show up on the pricing page.*
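
To answer the ratio question from the conclusion, it is enough to keep the numeric fields of each `RouteResult` and summarize them by model. A minimal sketch, assuming the records are collected in memory; `summarizeByModel`, `cheapShare`, and the sample records are hypothetical, and which models count as cheap is your call.

```typescript
// Shape mirrors the numeric fields of RouteResult from router.ts.
interface CallRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

interface ModelSummary {
  calls: number;
  costUsd: number;
}

// Total calls and spend per model.
function summarizeByModel(records: CallRecord[]): Map<string, ModelSummary> {
  const out = new Map<string, ModelSummary>();
  for (const r of records) {
    const s = out.get(r.model) ?? { calls: 0, costUsd: 0 };
    s.calls += 1;
    s.costUsd += r.costUsd;
    out.set(r.model, s);
  }
  return out;
}

// Fraction of calls that went to models you consider cheap.
function cheapShare(records: CallRecord[], cheapModels: Set<string>): number {
  if (records.length === 0) return 0;
  const cheap = records.filter((r) => cheapModels.has(r.model)).length;
  return cheap / records.length;
}

// Made-up records for illustration; costs follow the per-million prices in the TL;DR table.
const callLog: CallRecord[] = [
  { model: "claude-haiku-4-5-20251001", inputTokens: 120, outputTokens: 40, costUsd: 0.000256 },
  { model: "claude-haiku-4-5-20251001", inputTokens: 200, outputTokens: 60, costUsd: 0.0004 },
  { model: "claude-sonnet-4-6", inputTokens: 1900, outputTokens: 500, costUsd: 0.0132 },
];

const summary = summarizeByModel(callLog);
const share = cheapShare(callLog, new Set(["claude-haiku-4-5-20251001"]));

for (const [model, s] of summary) {
  console.log(model, s.calls, s.costUsd.toFixed(6));
}
console.log(`cheap share: ${(share * 100).toFixed(0)}%`);
```

In this made-up log, two of three calls go to the cheap tier while almost all of the spend sits on the one Sonnet call, which is the kind of split the ratio is meant to surface.
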

Tags: typescript, ai, webdev, tutorial
