Microsoft
Proprietary AI models developed by Microsoft, tailored for various enterprise applications and integrated within Azure services.
Total Models: 55
model-router

An affordable, efficient AI solution for diverse text and image tasks.

chat-completion
MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to fill in information gaps in the previous version of the model and improve its harm protections while maintaining R1 reasoning capabilities.

chat-completion
EvoDiff

Microsoft Research's EvoDiff is a diffusion modeling framework capable of generating high-fidelity, diverse, and novel proteins with the option of conditioning according to sequence constraints. Because it operates in the universal protein design space, EvoDiff can unconditionally sample divers

protein-sequence-generation
Phi-4-reasoning

State-of-the-art open-weight reasoning model.

chat-completion
Phi-4-mini-reasoning

Lightweight math reasoning model optimized for multi-step problem solving

chat-completion
Phi-4-mini-instruct

A 3.8B-parameter small language model that outperforms larger models in reasoning, math, coding, and function calling.

chat-completion
Phi-4-multimodal-instruct

First small multimodal model to have 3 modality inputs (text, audio, image), excelling in quality and efficiency

chat-completion
Phi-4

Phi-4 14B, a highly capable model for low latency scenarios.

chat-completion
financial-reports-analysis-v2

Adapted AI model for financial reports analysis based on Phi-4

chat-completion
supply-chain-trade-regulations-v2

Adapted AI model for supply chain trade regulations based on Phi-4

chat-completion
Muse

Muse is a World and Human Action Model (WHAM), a generative model of gameplay (visuals and/or controller actions).

image-to-image
Phi-3-small-8k-instruct

A 7B-parameter model that delivers better quality than Phi-3-mini, with a focus on high-quality, reasoning-dense data.

chat-completion
Azure-AI-Language

Azure AI Language is a cloud-based service designed to help you easily get insights from unstructured text data. It uses a combination of SLMs and LLMs, including task-optimized decoder models and encoder models, for Language AI solutions. It provides premium quality at an affor

text-analytics
conversational-ai
summarization
Phi-3-vision-128k-instruct

Phi-3 Vision is a lightweight, state-of-the-art open multimodal model built upon datasets which include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data on both text and vision. The model belongs to the Phi-3 model

chat-completion
TamGen

TamGen is a 100-million-parameter model that can generate compounds based on the input protein information. TamGen is pretrained on 10 million compounds from PubChem and fine-tuned on CrossDocked and PDB datasets. We evaluate TamGen on existing benchmarks and achieve top performance. Furthermor

protein-design
microsoft-Orca-2-7b

Orca 2 is a fine-tuned version of LLAMA-2. Orca 2’s training data is a synthetic dataset that was created to enhance the small model’s reasoning abilities. All synthetic training data was moderated using the Microsoft Azure content filters. More details about the model can be found in the [Orca 2 pap

text-generation
Phi-3-medium-4k-instruct

A 14B-parameter model that delivers better quality than Phi-3-mini, with a focus on high-quality, reasoning-dense data.

chat-completion
Phi-4-reasoning-plus-onnx

State-of-the-art open-weight reasoning model.

chat-completion
Azure-AI-Translator

Azure AI Translator, a part of the Azure AI services, is a cloud-based neural machine translation service that enables businesses to translate text and documents across multiple languages in real time and in batches. The service also offers customization options, enabling busi

translation
document-translation
Phi-4-reasoning-generic-cpu

This model is an optimized version of Phi-4-reasoning to enable local inference on CPUs. This model uses RTN quantization. Developed by: Microsoft. Model type: ONNX. License: MIT. Model description: This is a conversion of the Phi-4-reasoning for local infer

chat-completion
DeepSeek-R1-Distilled-NPU-Optimized

Learn more: [original model announcement]. DeepSeek-R1-Distilled-NPU-Optimized is a downloadable package of DeepSeek-R1-Distilled-Qwen-1.5B that is specifically optimized for the Neural Processing Unit (NPU). NPU-optimized models let develo

chat-completion
MedImageParse3D

Biomedical image analysis is fundamental for biomedical discovery in cell biology, pathology, radiology, and many other biomedical domains. 3D medical images such as CT and MRI play unique roles in clinical practices. MedImageParse 3D is a foundation model for imaging parsing that can jointly co

image-segmentation
BioEmu

Biomolecular Emulator (BioEmu) is a deep learning model that, given a protein sequence, can sample thousands of statistically independent structures from the protein structure ensemble per hour on a single graphics processing unit. By leveraging novel training methods and vast data of protein st

protein-structure-prediction
Phi-3.5-vision-instruct

Refresh of Phi-3-vision model.

chat-completion
Phi-3.5-MoE-instruct

A new mixture of experts model

chat-completion
Phi-4-reasoning-generic-gpu

This model is an optimized version of Phi-4-reasoning to enable local inference on GPUs. This model uses RTN quantization. Developed by: Microsoft. Model type: ONNX. License: MIT. Model description: This is a conversion of the Phi-4-reasoning for local infer

chat-completion
Phi-4-mini-reasoning-onnx

Lightweight math reasoning model optimized for multi-step problem solving

chat-completion
microsoft-llava-med-v1.5-mistral-7b

LLaVA-Med v1.5, using mistralai/Mistral-7B-Instruct-v0.2 as LLM for a better commercial license. Large Language and Vision Assistant for bioMedicine (i.e., “LLaVA-Med”) is a large language and vision model trained using a curriculum lear

image-text-to-text
financial-reports-analysis

The adapted AI model for financial reports analysis (preview) is a state-of-the-art small language model (SLM) based on the Phi-3-small-128k architecture, designed specifically for analyzing financial reports. It has been fine-tuned on a few hundred million tokens derived fro

chat-completion
Prov-GigaPath

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles[^1],[^2],[^3]. Previous models often rely predominantly on tile-level predictions, which can overlook critical slide-level context and spatial dependen

image-feature-extraction
microsoft-Orca-2-13b

Orca 2 is a fine-tuned version of LLAMA-2. Orca 2’s training data is a synthetic dataset that was created to enhance the small model’s reasoning abilities. All synthetic training data was moderated using the Microsoft Azure content filters. More details about the model can be found in the [Orca 2 pap

text-generation
MedImageParse

Biomedical image analysis is fundamental for biomedical discovery in cell biology, pathology, radiology, and many other biomedical domains. MedImageParse is a biomedical foundation model for imaging parsing that can jointly conduct segmentation, detection, and recognition across 9 imaging modalities

image-segmentation
microsoft-swinv2-base-patch4-window12-192-22k

The Swin Transformer V2 model, a type of Vision Transformer pretrained on ImageNet-21k at a resolution of 192x192, was introduced in the research paper (https://arxiv.org/abs/2111.09883) titled "Swin Transformer V2: Scaling Up Capacity and Resolution" authored by Liu

image-classification
BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

BiomedCLIP is a biomedical vision-language foundation model that is pretrained on PMC-15M, a dataset of 15 million figure-caption pairs extracted from biomedical research articles in PubMed Central, using contrastive learning. It uses PubMedBERT as the text encoder and Vision Transformer as the imag

zero-shot-image-classification
Phi-3-medium-128k-instruct

Same Phi-3-medium model, but with a larger context size for RAG or few-shot prompting.

chat-completion
Phi-4-reasoning-cuda-gpu

This model is an optimized version of Phi-4-reasoning to enable local inference on CUDA GPUs. This model uses RTN quantization. Developed by: Microsoft. Model type: ONNX. License: MIT. Model description: This is a conversion of the Phi-4-reasoning for local

chat-completion
CxrReportGen

The CXRReportGen model utilizes a multimodal architecture, integrating a BiomedCLIP image encoder with a Phi-3-Mini text encoder to help an application interpret complex medical imaging studies of chest X-rays. CXRReportGen follows the same framework as [MAIRA-2](https://www.microsoft

image-text-to-text
microsoft-rad-dino

RAD-DINO is a vision transformer model trained to encode chest X-rays using the self-supervised learning method DINOv2. RAD-DINO is described in detail in [RAD-DINO: Exploring Scalab

embeddings
microsoft-phi-2

phi-2 is a language model with 2.7 billion parameters. The phi-2 model was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts and filtered websites (for safety and educational value). When assessed a

text-generation
Azure-AI-Content-Understanding

Azure AI Content Understanding empowers you to transform unstructured multimodal data (such as text, images, audio, and video) into structured, actionable insights. By streamlining content processing with advanced AI techniques like schema extraction

intelligent-content-processing
custom-extraction
image-analysis
text-analysis
video-analysis
Phi-3-mini-128k-instruct

Same Phi-3-mini model, but with a larger context size for RAG or few-shot prompting.

chat-completion
Azure-AI-Vision

The Azure AI Vision service gives you access to advanced algorithms that process images and videos and return insights based on the visual features and content you are interested in. Azure AI Vision can power a diverse set of scenarios, including digital asset man

face-detection
image-analysis
optical-character-recognition
Phi-3-small-128k-instruct

Same Phi-3-small model, but with a larger context size for RAG or few-shot prompting.

chat-completion
MedImageInsight

Most medical imaging AI today is narrowly built to detect a small set of individual findings on a single modality like chest X-rays. This training approach is data and computationally inefficient, requiring ~6-12 months per finding [1], and often fails to generalize in real-world environments. By furt

embeddings
Phi-4-mini-reasoning-qnn-npu

This model is an optimized version of Phi-4-mini-reasoning to enable local inference on QNN NPUs. This model uses QuaRot and GPTQ quantization. Developed by: Microsoft. Model type: ONNX. License: MIT. Model description: This is a conversion of the Phi-4-mini

chat-completion
Aurora

Aurora is a machine learning model that can predict general environmental variables.

environmental-forecasting
Phi-3-mini-4k-instruct

Tiniest member of the Phi-3 family. Optimized for both quality and low latency.

chat-completion
Azure-AI-Speech

The Speech service provides speech-to-text and text-to-speech capabilities with a Speech resource. You can transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and use speaker recognition during conv

automatic-speech-recognition
text-to-speech
MatterSim

MatterSim is a large-scale pretrained deep learning model for efficient materials emulations and property predictions. MatterSim is a deep learning model for general materials design tasks. It supports efficient atomistic simulations at first-principles level and accurate prediction of broad materi

materials-design
supply-chain-trade-regulations

The adapted AI model for supply chain trade regulations analysis (preview) is a 3.8B-parameter, lightweight, state-of-the-art open model, trained using synthetic supply chain domain-specific datasets, focused on trade regulations. The model is fine-tuned on the base model, P

chat-completion
Phi-3.5-mini-instruct

Refresh of Phi-3-mini model.

chat-completion