Model Routing
The practice of sending different tasks to different AI models based on complexity, cost, and speed requirements.
What It Is
Model routing is the strategy of directing different types of requests to different AI models rather than using a single model for everything. Just as you would not hire a senior architect to hang a picture frame, you do not need the most powerful (and expensive) AI model for simple tasks. A routing system evaluates each request and sends it to the most appropriate model. Simple classification tasks might go to a fast, cheap model like Claude Haiku. Complex analysis or creative writing might go to a more capable model like Claude Opus. The goal is to optimize the balance between quality, speed, and cost across your entire workflow.
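The routing idea above can be sketched as a simple lookup from task type to model tier. This is a minimal illustration, not a production classifier: the task categories and the short model labels (`claude-haiku`, `claude-sonnet`, `claude-opus`) are illustrative shorthand, not exact API model identifiers.

```python
def route(task_type: str) -> str:
    """Map a task type to the cheapest model tier that can handle it well."""
    routes = {
        "classification": "claude-haiku",     # simple, high-volume tasks
        "extraction": "claude-haiku",
        "summarization": "claude-sonnet",     # balanced quality and cost
        "analysis": "claude-opus",            # complex reasoning
        "creative_writing": "claude-opus",
    }
    # Fall back to the balanced middle tier for unrecognized task types.
    return routes.get(task_type, "claude-sonnet")

print(route("classification"))      # claude-haiku
print(route("creative_writing"))    # claude-opus
```

In a real system the `task_type` itself often comes from a cheap classifier call or a heuristic (prompt length, keywords), so the router adds little latency of its own.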
Why It Matters
Using one model for everything is either wasteful (paying top dollar for simple tasks) or limiting (using a cheap model for complex work). Model routing lets operators build cost-effective systems that deliver the right quality at each step. As AI costs scale with usage, routing can reduce your bill significantly without sacrificing output quality where it counts. It also improves speed, since smaller models respond faster. For anyone building production AI systems, model routing is how you move from a prototype that works to a system that is sustainable at scale.
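To make the cost argument concrete, here is a back-of-the-envelope comparison between sending all traffic to a top-tier model and routing most of it to a cheap one. The per-million-token prices and the 90/10 traffic split are placeholder assumptions for illustration, not current list prices.

```python
# Placeholder prices in USD per million input tokens (assumptions, not real rates).
PRICE_PER_MTOK = {"cheap": 1.0, "capable": 15.0}

def total_cost(requests: int, tokens_per_request: int, mix: dict) -> float:
    """mix maps a model tier to the fraction of traffic it handles."""
    total_tokens = requests * tokens_per_request
    return sum(
        PRICE_PER_MTOK[tier] * fraction * total_tokens / 1_000_000
        for tier, fraction in mix.items()
    )

# One million requests of ~1,000 tokens each.
all_capable = total_cost(1_000_000, 1_000, {"capable": 1.0})
routed = total_cost(1_000_000, 1_000, {"cheap": 0.9, "capable": 0.1})
print(all_capable)  # 15000.0
print(routed)       # 2400.0
```

Under these assumed numbers, routing 90% of traffic to the cheap tier cuts the bill by roughly 84% while reserving the capable model for the work that needs it.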
In Practice
In a content pipeline, the first step (topic classification) uses Haiku because it is fast and cheap. The second step (research synthesis) uses Sonnet for balanced quality. The final step (editorial review) uses Opus for careful judgment. Each step gets the right model for the job. This routing can be configured in n8n, custom code, or platforms like Vercel AI Gateway that handle routing logic for you.
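The three-step pipeline above can be expressed as an ordered routing table. This is a sketch, not a real n8n or Vercel AI Gateway configuration; `call_model` stands in for whatever API wrapper you use, and the stub implementation lets the example run without credentials.

```python
# Each pipeline step is paired with the model tier assigned to it.
PIPELINE = [
    ("topic_classification", "haiku"),   # fast and cheap
    ("research_synthesis", "sonnet"),    # balanced quality
    ("editorial_review", "opus"),        # careful judgment
]

def run_pipeline(text: str, call_model=None) -> str:
    """Pass the text through each step using that step's assigned model."""
    if call_model is None:
        # Stub so the sketch runs offline: tags the text instead of calling an API.
        call_model = lambda model, step, t: f"[{model}:{step}] {t}"
    for step, model in PIPELINE:
        text = call_model(model, step, text)
    return text

print(run_pipeline("draft article"))
```

Swapping a model for one step is then a one-line change to the table, which is the practical payoff of keeping routing logic separate from pipeline logic.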