Overview
Mistral AI is a Paris‑based startup founded in April 2023 by former DeepMind and Meta researchers. In June 2024, it secured a €600 million Series B round that pushed its valuation to €6 billion, making it Europe's highest‑valued generative‑AI company. The firm develops high‑performance, open‑weight LLMs such as Mistral 7B, Mixtral 8×22B, and the 123‑billion‑parameter flagship Mistral Large, which debuted through a strategic partnership with Microsoft Azure. Its sparse Mixture‑of‑Experts architecture and permissive licensing drive efficient inference and rapid adoption across both open‑source and enterprise communities.

Key Azure AI Foundry Models (July 2025)
- Mistral Large 2 (24‑11) – 128K context, advanced function calling, and strong multilingual performance (a function‑calling sketch follows this list).
- Mixtral 8×22B‑Instruct – MoE architecture delivering 90 % GSM8K accuracy at lower cost.
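To make the function‑calling item concrete, here is a minimal sketch of a tool‑call request against a serverless Mistral Large deployment on Azure AI Foundry, using the `azure-ai-inference` Python package; the environment variable names and the `get_weather` tool are hypothetical placeholders, not part of any official sample.

```python
# Sketch: function calling against a serverless Mistral Large deployment
# on Azure AI Foundry via the azure-ai-inference package.
# Endpoint URL and key come from placeholder environment variables.
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import (
    ChatCompletionsToolDefinition,
    FunctionDefinition,
    UserMessage,
)
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_KEY"]),
)

# Describe one callable tool; the model decides whether to invoke it.
weather_tool = ChatCompletionsToolDefinition(
    function=FunctionDefinition(
        name="get_weather",  # hypothetical tool for illustration
        description="Get the current weather for a city.",
        parameters={
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
)

response = client.complete(
    messages=[UserMessage(content="What is the weather in Paris?")],
    tools=[weather_tool],
)

# When the model chooses the tool, the call arguments arrive here
# instead of plain text content.
print(response.choices[0].message.tool_calls)
```

If the model decides to call the tool, the application runs `get_weather` itself and returns the result in a follow‑up tool message so the model can compose its final answer.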
Why Mistral on Azure
Get European‑licensed open weights with Azure's Content Safety, deterministic scaling, and pay‑as‑you‑go billing, ideal for regulated industries that want transparent models.

Model Catalog

Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.
Mistral Small 3.1 enhances Mistral Small 3 with multimodal capabilities and a 128k context length.
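As a sketch of what the multimodal capability looks like at the API level, again assuming a serverless deployment reachable through `azure-ai-inference` (the image URL is a placeholder):

```python
# Sketch: sending an image plus a text prompt to a multimodal
# deployment (e.g. Mistral Small 3.1) via azure-ai-inference.
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import (
    ImageContentItem,
    ImageUrl,
    TextContentItem,
    UserMessage,
)
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_KEY"]),
)

# A single user message can mix text and image content items.
response = client.complete(
    messages=[
        UserMessage(
            content=[
                TextContentItem(text="Describe the chart in this image."),
                ImageContentItem(image_url=ImageUrl(url="https://example.com/chart.png")),
            ]
        )
    ],
)
print(response.choices[0].message.content)
```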
Ministral 3B is a state-of-the-art Small Language Model (SLM) optimized for edge computing and on-device applications. Designed for low-latency, compute-efficient inference, it is also well suited to standard GenAI applications.
Codestral 25.01 by Mistral AI is designed for code generation, supporting 80+ programming languages and optimized for tasks like code completion and fill-in-the-middle (FIM).
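A minimal fill‑in‑the‑middle sketch, assuming Mistral's public La Plateforme API and the `codestral-latest` model alias; the endpoint path and response shape follow Mistral's API documentation, but treat them as assumptions to verify:

```python
# Sketch: fill-in-the-middle with Codestral. The model completes the
# code between `prompt` (prefix) and `suffix`.
import os

import requests

payload = {
    "model": "codestral-latest",
    "prompt": "def fibonacci(n: int) -> int:\n",  # code before the hole
    "suffix": "\n\nprint(fibonacci(10))",         # code after the hole
    "max_tokens": 128,
}
resp = requests.post(
    "https://api.mistral.ai/v1/fim/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()

# The completion that belongs between prefix and suffix.
print(resp.json()["choices"][0]["message"]["content"])
```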
The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.
Mistral Small can be used for any language-based task that requires high efficiency and low latency.
Mistral OCR performs document conversion to Markdown with interleaved images and text.
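A hedged sketch of that document‑to‑Markdown flow against Mistral's OCR API; the `/v1/ocr` path, the `mistral-ocr-latest` alias, and the `pages[].markdown` response field are assumptions based on Mistral's public documentation, and the PDF URL is a placeholder:

```python
# Sketch: converting a PDF to Markdown with Mistral's OCR endpoint.
import os

import requests

resp = requests.post(
    "https://api.mistral.ai/v1/ocr",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-ocr-latest",
        "document": {
            "type": "document_url",
            "document_url": "https://example.com/report.pdf",  # placeholder
        },
        "include_image_base64": True,  # return extracted images inline
    },
    timeout=60,
)
resp.raise_for_status()

# Each page carries its own Markdown, with image references interleaved.
for page in resp.json()["pages"]:
    print(page["markdown"])
```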
The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts that outperforms Llama 2 70B on most benchmarks with 6x faster inference. Mixtral-8x7B-v0.1 is a decoder-only model whose feed-forward blocks hold 8 distinct groups of parameters, the "experts" (46.7B parameters in total, roughly 13B active per token). At every layer, for every token, a router network selects two of these experts to process the token and combines their outputs additively.
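The routing scheme is easy to see in a toy implementation. The sketch below is illustrative only (random weights, a single token, linear stand‑ins for the expert FFNs), not the actual Mixtral code:

```python
# Toy sketch of sparse Mixture-of-Experts routing: a router scores
# 8 experts per token, keeps the top 2, and combines their outputs
# additively, weighted by softmax scores over the selected experts.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

router_w = rng.normal(size=(d_model, n_experts))                 # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) hidden state for one token."""
    logits = x @ router_w                                        # (n_experts,)
    top = np.argsort(logits)[-top_k:]                            # indices of top-2 experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()    # softmax over selected
    # Only the selected experts run; their outputs are summed.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d_model))
print(out.shape)  # (16,)
```

Because only 2 of the 8 experts run per token, per-token compute is a fraction of what a dense model with the same total parameter count would need, which is the intuition behind the faster inference noted above.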
The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of Mixtral-8x22B-v0.1.
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks tested; full details are in the paper and release blog post.
The Mixtral-8x22B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x22B-v0.1 is a pretrained base model and therefore does not have any moderation mechanisms. Evaluation results are available on the Open LLM Leaderboard.
Mistral Large (2407) is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge and coding capabilities.
Mistral Nemo is a cutting-edge Language Model (LLM) boasting state-of-the-art reasoning, world knowledge, and coding capabilities within its size category.
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.2, which has the following changes compared to Mistral-7B-v0.1:
- 32k context window (vs. 8k context in v0.1)
- rope_theta = 1e6
- no sliding-window attention
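The changes above map naturally onto model-configuration fields. A purely illustrative sketch, with values given as (v0.1, v0.2) pairs rather than copied verbatim from the published config files:

```python
# Illustrative (v0.1, v0.2) pairs for the settings that changed between
# Mistral-7B-v0.1 and Mistral-7B-v0.2; field names loosely follow the
# Hugging Face MistralConfig convention, values are as described above.
changes_v0_1_to_v0_2 = {
    "context_window": (8_192, 32_768),  # 8k -> 32k tokens
    "rope_theta": (1e4, 1e6),           # RoPE base frequency raised for long context
    "sliding_window": (4_096, None),    # sliding-window attention removed in v0.2
}

for field, (v01, v02) in changes_v0_1_to_v0_2.items():
    print(f"{field}: {v01} -> {v02}")
```

A larger rope_theta stretches the rotary position embedding's wavelengths, which is what lets v0.2 attend usefully across the longer 32k window without sliding-window attention.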
Mistral Large 24.11 offers enhanced system prompts, advanced reasoning and function calling capabilities.