model-router

model-router

Model router is a deployable AI model that is trained to select the most suitable large language model (LLM) for a given prompt.
Microsoft
Direct from Azure
Version: 2025-11-18

Welcome to the 2025-11-18 version of model router.

Model Router dynamically selects the optimal large language model (LLM) for a specific query or task in real time. By evaluating factors like query complexity, cost, and performance, it efficiently routes requests to the most suitable model, ensuring high quality results while minimizing costs.

This version adds several new capabilities:

  1. Support Global Standard and Data Zone Standard deployments.
  2. Adds support for new models: grok-4, grok-4-1-fast-reasoning, DeepSeek-V3.1, DeepSeek-V3.2, gpt-oss-120b, Llama-4-Maverick-17B-128E-Instruct-FP8, claude-haiku-4-5, claude-sonnet-4-5, claude-opus-4-1, claude-opus-4-6, claude-opus-4-7, gpt-4o, gpt-4o-mini, gpt-5.2, gpt-5.2-chat, gpt-5.3-chat, gpt-5.4-nano, gpt-5.4-mini, gpt-5.4 and gpt-5.5.
  3. Support for agentic scenarios including tools so you can now use it in the Foundry Agent service.
  4. Quick deploy or Custom deploy with routing mode an

Quick facts

Model providerMicrosoft
TypeChat completion
LifecycleGenerally available (GA)
Input typetext, image
Output typetext
Context window1048.576k
Token limits32768 output