Model router
Version: 2025-05-19
Model Router dynamically selects the optimal large language model (LLM) for a specific query or task in real time. By evaluating factors such as query complexity, cost, and performance, it routes each request to the most suitable model, delivering high-quality results while minimizing cost.
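The routing idea described above can be illustrated with a toy sketch. This is not how Model Router is implemented: the candidate names, cost figures, capability scores, and the `estimate_complexity` heuristic are all hypothetical and chosen only so the example runs. It shows one simple policy: pick the cheapest candidate whose capability meets the estimated complexity of the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    relative_cost: float  # hypothetical relative cost, not real pricing
    capability: int       # hypothetical score 1-3; higher handles harder queries


# Hypothetical candidates, used for illustration only.
CANDIDATES = [
    Candidate("small-model", 0.1, 1),
    Candidate("medium-model", 0.4, 2),
    Candidate("large-model", 2.0, 3),
]


def estimate_complexity(query: str) -> int:
    """Crude stand-in for a complexity estimator (assumption, not the real one)."""
    score = 1
    if len(query.split()) > 30:
        score += 1  # long queries are treated as harder
    if any(k in query.lower() for k in ("prove", "derive", "step by step")):
        score += 1  # reasoning-style keywords are treated as harder
    return min(score, 3)


def route(query: str) -> Candidate:
    """Return the cheapest candidate that is capable enough for the query."""
    needed = estimate_complexity(query)
    eligible = [c for c in CANDIDATES if c.capability >= needed]
    return min(eligible, key=lambda c: c.relative_cost)


if __name__ == "__main__":
    print(route("What is the capital of France?").name)
    print(route("Derive the closed form of this recurrence.").name)
```

A real router would rely on learned signals rather than keyword matching; the sketch only makes the cost-versus-capability trade-off concrete.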
In our tests comparing Model Router with GPT-4.1 alone, we saw up to 60% cost savings with similar accuracy. The context length for Model Router depends on the underlying model used for each prompt. In the 2025-05-19 version, the maximum input size is 200,000 tokens and the maximum output size is 32,768 tokens. For more information, see the Model router documentation.
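As a minimal sketch of how the limits above might be checked before sending a request, the helper below compares token counts against the 2025-05-19 figures. The function name `check_budget` and its parameters are hypothetical, and the token counts are assumed to come from whatever tokenizer you already use. Because the effective context length depends on the model selected for each prompt, treat these values as upper bounds rather than guarantees.

```python
# Limits stated for router version 2025-05-19.
MAX_INPUT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 32_768


def check_budget(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        problems.append(
            f"input has {input_tokens} tokens, limit is {MAX_INPUT_TOKENS}"
        )
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"requested {requested_output_tokens} output tokens, limit is {MAX_OUTPUT_TOKENS}"
        )
    return problems


if __name__ == "__main__":
    for issue in check_budget(250_000, 40_000):
        print(issue)
```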
Model versions in the router
Router Version | Model | Model version | Availability | Lifecycle |
---|---|---|---|---|
2025-05-19 | gpt-4.1 | 2025-04-14 | Global standard | Generally available |
2025-05-19 | gpt-4.1-mini | 2025-04-14 | Global standard | Generally available |
2025-05-19 | gpt-4.1-nano | 2025-04-14 | Global standard | Generally available |
2025-05-19 | o4-mini | 2025-04-16 | Global standard | Generally available |
Model Specifications
Context Length: 1048576
License: Custom
Training Data: May 2025
Last Updated: May 2025
Input Type: Text, Image
Output Type: Text
Publisher: Microsoft
Languages: 1 Language