All AI models
Browse our comprehensive collection of AI models from Azure AI Foundry
Total Models: 11153
sora

An efficient AI solution to generate videos

video-generation
grok-3

Grok 3 is xAI's debut model, pretrained by Colossus at supermassive scale to excel in specialized domains like finance, healthcare, and the law.

chat-completion
grok-3-mini

Grok 3 Mini is a lightweight model that thinks before responding. Trained on mathematical and scientific problems, it is great for logic-based tasks.

chat-completion
model-router

An affordable, efficient AI solution for diverse text and image tasks.

chat-completion
o3

o3 includes significant improvements on quality and safety while supporting the existing features of o1 and delivering comparable or better performance.

chat-completion
o4-mini

o4-mini includes significant improvements on quality and safety while supporting the existing features of o3-mini and delivering comparable or better performance.

chat-completion
MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to fill in information gaps in the previous version of the model and improve its harm protections while maintaining R1 reasoning capabilities.

chat-completion
gpt-image-1

An efficient AI solution for diverse text and image tasks, including text to image, image to image, inpainting, and prompt transformation.

text-to-image
gpt-4.1

gpt-4.1 outperforms gpt-4o across the board, with major gains in coding, instruction following, and long-context understanding

chat-completion
gpt-4.1-mini

gpt-4.1-mini outperforms gpt-4o-mini across the board, with major gains in coding, instruction following, and long-context handling

chat-completion
gpt-4.1-nano

gpt-4.1-nano provides gains in coding, instruction following, and long-context handling along with lower latency and cost

chat-completion
mistral-medium-2505

Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.

chat-completion
conversational
image-classification
EvoDiff

Microsoft Research's EvoDiff is a diffusion modeling framework capable of generating high-fidelity, diverse, and novel proteins with the option of conditioning according to sequence constraints. Because it operates in the universal protein design space, EvoDiff can unconditionally sample divers

protein-sequence-generation
Phi-4-reasoning

State-of-the-art open-weight reasoning model.

chat-completion
Phi-4-mini-reasoning

Lightweight math reasoning model optimized for multi-step problem solving

chat-completion
Llama-4-Scout-17B-16E-Instruct

Llama 4 Scout 17B 16E Instruct is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.

chat-completion
Llama-4-Maverick-17B-128E-Instruct-FP8

Llama 4 Maverick 17B 128E Instruct FP8 is great at precise image understanding and creative writing, offering high quality at a lower price compared to Llama 3.3 70B

chat-completion
cohere-command-a

Command A is a highly efficient generative model that excels at agentic and multilingual use cases.

chat-completion
embed-v-4-0

Embed 4 transforms texts and images into numerical vectors

embeddings
summarization
gpt-4.5-preview

The largest and strongest general-purpose model in the gpt model family to date, best suited for diverse text and image tasks.

chat-completion
o3-mini

o3-mini includes the o1 features with significant cost-efficiencies for scenarios requiring high performance.

chat-completion
DeepSeek-V3-0324

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.

chat-completion
Llama-4-Scout-17B-16E

Llama 4 Scout 17B 16E is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.

chat-completion
gpt-4o-mini-tts

An advanced text-to-speech solution designed to convert written text into natural-sounding speech.

text-to-speech
gpt-4o-transcribe

A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts.

speech-to-text
gpt-4o-mini-transcribe

A highly efficient and cost-effective speech-to-text solution that delivers reliable and accurate transcripts.

speech-to-text
DeepSeek-V3

A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

chat-completion
DeepSeek-R1

Trained with a step-by-step process, DeepSeek-R1 excels at reasoning tasks such as language, scientific reasoning, and coding.

chat-completion
computer-use-preview

computer-use-preview is the model for the Computer Use Agent in the Responses API. You can use the computer-use-preview model to get instructions for controlling a browser on your computer screen and taking action on a user's behalf.

responses
Phi-4-mini-instruct

A 3.8B-parameter small language model that outperforms larger models in reasoning, math, coding, and function calling

chat-completion
Phi-4-multimodal-instruct

First small multimodal model to have 3 modality inputs (text, audio, image), excelling in quality and efficiency

chat-completion
Phi-4

Phi-4 14B, a highly capable model for low latency scenarios.

chat-completion
mistral-small-2503

Enhanced Mistral Small 3 with multimodal capabilities and a 128k context length.

chat-completion
Completions
Conversational
Image classification
Question answering
gpt-4o-mini-audio-preview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
gpt-4o-mini-realtime-preview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
o1

Focused on advanced reasoning and solving complex problems, including math and science tasks. Ideal for applications that require deep contextual understanding and agentic workflows.

chat-completion
o1-mini

Smaller, faster, and 80% cheaper than o1-preview, performs well at code generation and small context operations.

chat-completion
gpt-4o

OpenAI's most advanced multimodal model in the gpt-4o family. Can handle both text and image inputs.

chat-completion
gpt-4o-mini

An affordable, efficient AI solution for diverse text and image tasks.

chat-completion
gpt-4o-audio-preview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

audio-generation
gpt-4o-realtime-preview

The gpt-4o-realtime-preview model introduces a new era in AI interaction by incorporating the new audio modality powered by gpt-4o. This new modality allows for seamless speech-to-speech and text-to-speech applications, providing a richer and more engaging user experience. Engineered for speed and e

audio-generation
financial-reports-analysis-v2

Adapted AI model for financial reports analysis based on Phi-4

chat-completion
supply-chain-trade-regulations-v2

Adapted AI model for supply chain trade regulations based on Phi-4

chat-completion
Muse

Muse is a World and Human Action Model (WHAM), a generative model of gameplay (visuals and/or controller actions).

image-to-image
Cohere-rerank-v3.5

Cohere’s Rerank 3.5 provides a significant boost to the relevancy of search results. This AI model, also known as a cross-encoder, precisely sorts lists of documents according to their semantic similarity to a provided query. This allows information retrieval systems to go beyond keyword search and

text-classification
Stable-Diffusion-3.5-Large

At 8.1 billion parameters, with superior quality and prompt adherence, this base model is the most powerful in the Stable Diffusion family. This model is ideal for professional use cases at 1 megapixel resolution. Stable Diffusion 3.5 Large produces diverse outputs, creating images that are represe

text-to-image
image-to-image
Stable-Image-Ultra

Powered by the advanced capabilities of Stable Diffusion 3.5 Large, Stable Image Ultra sets a new standard in photorealism. Stable Image Ultra is ideal for product imagery in marketing and advertising. It also excels in typography, dynamic lighting, and vibrant color rendering.

text-to-image
Stable-Image-Core

Leveraging an enhanced version of SDXL, Stable Image Core delivers exceptional speed and efficiency while maintaining the high-quality output synonymous with Stable Diffusion models.

text-to-image
Gretel-Navigator-Tabular

Gretel Navigator Tabular generates production-quality synthetic data optimized for AI and machine learning development from prompts, schema definitions, or seed examples. Unlike single-LLM approaches to data generation, Navigator Tabular employs a compound AI architecture specifically engineered for

chat-completion
data-generation
o1-preview

Focused on advanced reasoning and solving complex problems, including math and science tasks. Ideal for applications that require deep contextual understanding and agentic workflows.

chat-completion
Llama-3.3-70B-Instruct

Llama 3.3 70B Instruct offers enhanced reasoning, math, and instruction following with performance comparable to Llama 3.1 405B.

chat-completion