All AI models
Browse our comprehensive collection of AI models from Azure AI Foundry
Total Models: 11234
gpt-4o-transcribe-diarize

A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts, now with diarization support: identifying the different speakers throughout the transcription.

speech-to-text
sora-2

Sora 2 in Azure AI Foundry isn’t just another video generation tool; it’s a creative powerhouse, seamlessly integrated into a platform built for innovation, trust, and scale.

video-generation
gpt-5-pro

gpt-5-pro uses more compute to think harder and provide consistently better answers.

chat-completion
gpt-audio-mini

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
gpt-realtime-mini

gpt-realtime-mini is a smaller version of the gpt-realtime S2S (speech-to-speech) model, built on chive architecture. This model excels at instruction following and is optimized for cost efficiency.

audio-generation
grok-4

Grok 4 is the latest reasoning model from xAI with advanced reasoning and tool-use capabilities, enabling it to achieve new state-of-the-art performance across challenging academic and industry benchmarks.

chat-completion
grok-4-fast-reasoning

Grok 4 Fast is an efficiency-focused large language model developed by xAI, pre-trained on general-purpose data and post-trained on task demonstrations and tool use, with built-in safety features including refusal behaviors, a fixed system prompt enforcing

chat-completion
grok-4-fast-non-reasoning

Grok 4 Fast is an efficiency-focused large language model developed by xAI, pre-trained on general-purpose data and post-trained on task demonstrations and tool use, with built-in safety features including refusal behaviors, a fixed system prompt enforcing

chat-completion
gpt-5-codex

gpt-5-codex is designed for steerability, front end development, and interactivity.

responses
DeepSeek-V3.1

DeepSeek-V3.1 is a hybrid model that improves tool usage and thinking efficiency, and supports both thinking and non-thinking modes via chat template switching.

chat-completion
grok-code-fast-1

Grok Code Fast 1 is a fast, economical AI model for agentic coding, built from scratch with a new architecture, trained on programming-rich data, and fine-tuned for real-world coding tasks like bug fixes and project setup.

chat-completion
gpt-audio

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
gpt-realtime

A new S2S (speech to speech) model with improved instruction following.

audio-generation
gpt-5

gpt-5 is designed for logic-heavy and multi-step tasks.

chat-completion
gpt-5-mini

gpt-5-mini is a lightweight version for cost-sensitive applications.

chat-completion
gpt-5-nano

gpt-5-nano is optimized for speed, ideal for applications requiring low latency.

chat-completion
gpt-5-chat

gpt-5-chat (preview) is designed for advanced, natural, multimodal, and context-aware conversations in enterprise applications.

chat-completion
FLUX-1.1-pro

Generate images with amazing image quality, prompt adherence, and diversity at blazing fast speeds. FLUX1.1 [pro] delivers six times faster image generation and achieved the highest Elo score on Artificial Analysis benchmarks when launched, surpassing all

text-to-image
FLUX.1-Kontext-pro

Generate and edit images through both text and image prompts. FLUX.1 Kontext is a multimodal flow matching model that enables both text-to-image generation and in-context image editing. Modify images while maintaining character consistency and performing l

text-to-image
image-to-image
o3-pro

The o3 series of models is trained with reinforcement learning to think before answering and to perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers.

responses
codex-mini

codex-mini is a fine-tuned variant of the o4-mini model, designed to deliver rapid, instruction-following performance for developers working in CLI workflows. Whether you're automating shell commands, editing scripts, or refactoring repositories, Codex-Min

responses
DeepSeek-R1-0528

The DeepSeek R1 0528 model has improved reasoning capabilities. This version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.

chat-completion
sora

An efficient AI solution to generate videos

video-generation
grok-3

Grok 3 is xAI's debut model, pretrained by Colossus at supermassive scale to excel in specialized domains like finance, healthcare, and the law.

chat-completion
grok-3-mini

Grok 3 Mini is a lightweight model that thinks before responding. Trained on mathematical and scientific problems, it is great for logic-based tasks.

chat-completion
model-router

Model router is a deployable AI model that is trained to select the most suitable large language model (LLM) for a given prompt.

chat-completion
o3

o3 includes significant improvements in quality and safety while supporting the existing features of o1 and delivering comparable or better performance.

chat-completion
o4-mini

o4-mini includes significant improvements in quality and safety while supporting the existing features of o3-mini and delivering comparable or better performance.

chat-completion
MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to fill in information gaps in the previous version of the model and improve its harm protections while maintaining R1 reasoning capabilities.

chat-completion
gpt-image-1

An efficient AI solution for diverse text and image tasks, including text to image, image to image, inpainting, and prompt transformation.

text-to-image
gpt-4.1

gpt-4.1 outperforms gpt-4o across the board, with major gains in coding, instruction following, and long-context understanding

chat-completion
gpt-4.1-mini

gpt-4.1-mini outperforms gpt-4o-mini across the board, with major gains in coding, instruction following, and long-context handling

chat-completion
gpt-4.1-nano

gpt-4.1-nano provides gains in coding, instruction following, and long-context handling along with lower latency and cost

chat-completion
mistral-medium-2505

Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.

chat-completion
image-classification
EvoDiff

Microsoft Research's EvoDiff is a diffusion modeling framework capable of generating high-fidelity, diverse, and novel proteins with the option of conditioning according to sequence constraints. Because it operates in the universal protein design space, EvoDiff can unconditionally sample diverse str

protein-sequence-generation
Phi-4-reasoning

State-of-the-art open-weight reasoning model.

chat-completion
Phi-4-mini-reasoning

Lightweight math reasoning model optimized for multi-step problem solving

chat-completion
Llama-4-Scout-17B-16E-Instruct

Llama 4 Scout 17B 16E Instruct is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.

chat-completion
Llama-4-Maverick-17B-128E-Instruct-FP8

Llama 4 Maverick 17B 128E Instruct FP8 is great at precise image understanding and creative writing, offering high quality at a lower price compared to Llama 3.3 70B

chat-completion
cohere-command-a

Command A is a highly efficient generative model that excels at agentic and multilingual use cases.

chat-completion
embed-v-4-0

Embed 4 transforms texts and images into numerical vectors

embeddings
summarization
gpt-4.5-preview

The largest and strongest general-purpose model in the GPT model family to date, best suited for diverse text and image tasks.

chat-completion
o3-mini

o3-mini includes the o1 features with significant cost-efficiencies for scenarios requiring high performance.

chat-completion
DeepSeek-V3-0324

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.

chat-completion
Llama-4-Scout-17B-16E

Llama 4 Scout 17B 16E is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.

chat-completion
gpt-4o-mini-tts

An advanced text-to-speech solution designed to convert written text into natural-sounding speech.

text-to-speech
gpt-4o-transcribe

A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts.

speech-to-text
gpt-4o-mini-transcribe

A highly efficient and cost-effective speech-to-text solution that delivers reliable and accurate transcripts.

speech-to-text
DeepSeek-V3

A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

chat-completion
DeepSeek-R1

DeepSeek-R1 uses a step-by-step training process to excel at reasoning tasks such as language, scientific reasoning, and coding.

chat-completion
computer-use-preview

computer-use-preview is the model for the Computer Use Agent, for use in the Responses API. You can use the computer-use-preview model to get instructions to control a browser on your computer screen and take action on a user's behalf.

responses
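
Each entry in the listing above pairs a model name with one or more task tags (for example, FLUX.1-Kontext-pro carries both text-to-image and image-to-image). As a minimal sketch of how such a listing could be indexed by task, the snippet below hand-copies a few entries and groups them. The `CatalogEntry` type, the `SAMPLE` list, and `group_by_task` are hypothetical names introduced for illustration; they are not part of any Azure AI Foundry SDK.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record for one catalog card: a model name and its task tags."""
    name: str
    tasks: Tuple[str, ...]


# A hand-copied sample of entries from the listing (not the full catalog).
SAMPLE = [
    CatalogEntry("gpt-4o-transcribe-diarize", ("speech-to-text",)),
    CatalogEntry("sora-2", ("video-generation",)),
    CatalogEntry("gpt-5-codex", ("responses",)),
    CatalogEntry("FLUX.1-Kontext-pro", ("text-to-image", "image-to-image")),
    CatalogEntry("DeepSeek-V3.1", ("chat-completion",)),
    CatalogEntry("embed-v-4-0", ("embeddings", "summarization")),
]


def group_by_task(entries: List[CatalogEntry]) -> Dict[str, List[str]]:
    """Map each task tag to the names of the models that list it."""
    groups = defaultdict(list)
    for entry in entries:
        for task in entry.tasks:
            groups[task].append(entry.name)
    return dict(groups)


if __name__ == "__main__":
    for task, names in sorted(group_by_task(SAMPLE).items()):
        print(f"{task}: {', '.join(names)}")
```

A model with several tags appears under each of them, so a lookup by task returns every model that advertises that capability.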