model-router

Model router is a deployable AI model that is trained to select the most suitable large language model (LLM) for a given prompt.

Microsoft

Direct from Azure

Version: 2025-11-18

Welcome to the 2025-11-18 version of model router.

Model Router dynamically selects the optimal large language model (LLM) for a specific query or task in real time. By evaluating factors like query complexity, cost, and performance, it efficiently routes requests to the most suitable model, ensuring high quality results while minimizing costs.

This version adds several new capabilities:

Support Global Standard and Data Zone Standard deployments.
Adds support for new models: grok-4, grok-4-1-fast-reasoning, DeepSeek-V3.1, DeepSeek-V3.2, gpt-oss-120b, Llama-4-Maverick-17B-128E-Instruct-FP8, claude-haiku-4-5, claude-sonnet-4-5, claude-opus-4-1, claude-opus-4-6, claude-opus-4-7, gpt-4o, gpt-4o-mini, gpt-5.2, gpt-5.2-chat, gpt-5.3-chat, gpt-5.4-nano, gpt-5.4-mini, gpt-5.4 and gpt-5.5.
Support for agentic scenarios including tools so you can now use it in the Foundry Agent service.
Quick deploy or Custom deploy with routing mode an

Quick facts

Model providerMicrosoft

TypeChat completion

LifecycleGenerally available (GA)

Input typetext, image

Output typetext

Context window1048.576k

Token limits32768 output

PricingView pricing

model-router

Quick facts

Quick start