All AI models
Browse our comprehensive collection of AI models from Microsoft Foundry
Total Models: 11321
DeepSeek-V3.2

DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance

chat-completion
DeepSeek-V3.2-Speciale

DeepSeek-V3.2 Speciale, a model that harmonizes high computational efficiency with superior reasoning and agent performance

chat-completion
gpt-5.2-chat

gpt-5.2-chat (preview) provides advanced, natural, multimodal, and context-aware conversations for enterprise applications.

chat-completion
responses
gpt-5.2

GPT-5.2 is engineered for enterprise agent scenarios—delivering structured, auditable outputs, reliable tool use, and governed integrations.

chat-completion
responses
Cohere-rerank-v4.0-fast

Rerank improves search systems by sorting documents based on their semantic similarity to a query

text-classification
Kimi-K2-Thinking

Kimi K2 Thinking is the latest, most capable version of the open-source thinking model

chat-completion
gpt-5.1-codex-max

gpt-5.1-codex-max is an agentic coding model designed to streamline complex development workflows with advanced efficiency

responses
claude-opus-4-5

Claude Opus 4.5 is Anthropic’s most intelligent model, and an industry leader across coding, agents, computer use, and enterprise workflows. With a 200K token context window and 64K max output, Opus 4.5 is ideal for production code, sophisticated agents, o

messages
claude-sonnet-4-5

Claude Sonnet 4.5 is Anthropic's most capable model for complex agents and an industry leader for coding and computer use.

messages
gpt-5.1

gpt-5.1 is designed for logic-heavy and multi-step tasks.

chat-completion
responses
gpt-5.1-codex

gpt-5.1-codex is designed for steerability, front end development, and interactivity.

responses
DeepSeek-V3.1

DeepSeek-V3.1 is a hybrid model that enhances tool usage and thinking efficiency, and supports both thinking and non-thinking modes via chat template switching

chat-completion
Mistral-Large-3

Mistral Large 3 is a state-of-the-art, general-purpose, multimodal, granular Mixture-of-Experts model with 39B active parameters and 673B total parameters, featuring 128 experts per layer and Multi-Latent attention.

chat-completion
gpt-5-chat

gpt-5-chat (preview) provides advanced, natural, multimodal, and context-aware conversations for enterprise applications.

chat-completion
responses
claude-haiku-4-5

Claude Haiku 4.5 delivers near-frontier performance for a wide range of use cases, and stands out as one of the best coding and agent models – with the right speed and cost to power free products and scaled sub-agents.

messages
model-router

Model router is a deployable AI model that is trained to select the most suitable large language model (LLM) for a given prompt.

chat-completion
claude-opus-4-1

Claude Opus 4.1 is an industry leader for coding. It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve.

messages
grok-4

Grok 4 is the latest reasoning model from xAI with advanced reasoning and tool-use capabilities, enabling it to achieve new state-of-the-art performance across challenging academic and industry benchmarks.

chat-completion
sora-2

Sora 2 in Azure AI Foundry isn't just another video generation tool; it's a creative powerhouse, seamlessly integrated into a platform built for innovation, trust, and scale.

video-generation
embed-v-4-0

Embed 4 transforms texts and images into numerical vectors

embeddings
summarization
gpt-5.1-chat

gpt-5.1-chat (preview) provides advanced, natural, multimodal, and context-aware conversations for enterprise applications.

chat-completion
responses
gpt-5.1-codex-mini

gpt-5.1-codex-mini is designed for steerability, front end development, and interactivity.

responses
grok-4-fast-reasoning

Grok 4 Fast is an efficiency-focused large language model developed by xAI, pre-trained on general-purpose data and post-trained on task demonstrations and tool use, with built-in safety features including refusal behaviors, a fixed system prompt enforcing

chat-completion
gpt-5-pro

gpt-5-pro uses more compute to think harder and provide consistently better answers.

chat-completion
responses
Llama-4-Maverick-17B-128E-Instruct-FP8

Llama 4 Maverick 17B 128E Instruct FP8 is great at precise image understanding and creative writing, offering high quality at a lower price compared to Llama 3.3 70B

chat-completion
gpt-5

gpt-5 is designed for logic-heavy and multi-step tasks.

chat-completion
responses
DeepSeek-V3-0324

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.

chat-completion
gpt-4.1

gpt-4.1 outperforms gpt-4o across the board, with major gains in coding, instruction following, and long-context understanding

chat-completion
responses
gpt-4.1-mini

gpt-4.1-mini outperforms gpt-4o-mini across the board, with major gains in coding, instruction following, and long-context handling

chat-completion
responses
grok-4-fast-non-reasoning

Grok 4 Fast is an efficiency-focused large language model developed by xAI, pre-trained on general-purpose data and post-trained on task demonstrations and tool use, with built-in safety features including refusal behaviors, a fixed system prompt enforcing

chat-completion
gpt-4o-transcribe-diarize

A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts, now equipped with diarization support (identifying different speakers throughout the transcription).

speech-to-text
Flux.1-Kontext-pro

Generate and edit images through both text and image prompts. FLUX.1 Kontext is a multimodal flow matching model that enables both text-to-image generation and in-context image editing. Modify images while maintaining character consistency and performing l

text-to-image
image-to-image
gpt-5-codex

gpt-5-codex is designed for steerability, front end development, and interactivity.

chat-completion
responses
Flux-1.1-Pro

Generate images with amazing image quality, prompt adherence, and diversity at blazing fast speeds. FLUX1.1 [pro] delivers six times faster image generation and achieved the highest Elo score on Artificial Analysis benchmarks when launched, surpassing all

text-to-image
o3

o3 includes significant improvements on quality and safety while supporting the existing features of o1 and delivering comparable or better performance.

chat-completion
responses
gpt-realtime-mini

gpt-realtime-mini is a smaller version of the gpt-realtime S2S (speech-to-speech) model, built on chive architecture. This model excels at instruction following and is optimized for cost efficiency.

audio-generation
gpt-5-nano

gpt-5-nano is optimized for speed, ideal for applications requiring low latency.

chat-completion
responses
gpt-5-mini

gpt-5-mini is a lightweight version for cost-sensitive applications.

chat-completion
responses
DeepSeek-R1-0528

The DeepSeek R1 0528 model has improved reasoning capabilities. This version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.

chat-completion
grok-3

Grok 3 is xAI's debut model, pretrained by Colossus at supermassive scale to excel in specialized domains like finance, healthcare, and the law.

chat-completion
MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to fill in information gaps in the previous version of the model and improve its harm protections while maintaining R1 reasoning capabilities.

chat-completion
o4-mini

o4-mini includes significant improvements on quality and safety while supporting the existing features of o3-mini and delivering comparable or better performance.

chat-completion
responses
gpt-4.1-nano

gpt-4.1-nano provides gains in coding, instruction following, and long-context handling along with lower latency and cost

chat-completion
responses
grok-code-fast-1

Grok Code Fast 1 is a fast, economical AI model for agentic coding, built from scratch with a new architecture, trained on programming-rich data, and fine-tuned for real-world coding tasks like bug fixes and project setup.

chat-completion
mistral-document-ai-2505

Document conversion to markdown with interleaved images and text

image-to-text
o3-mini

o3-mini includes the o1 features with significant cost-efficiencies for scenarios requiring high performance.

chat-completion
responses
gpt-audio-mini

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
gpt-oss-120B

Push the open model frontier with GPT-OSS models, released under the permissive Apache 2.0 license, allowing anyone to use, modify, and deploy them freely.

chat-completion
grok-3-mini

Grok 3 Mini is a lightweight model that thinks before responding. Trained on mathematical and scientific problems, it is great for logic-based tasks.

chat-completion