Overview
Mistral AI is a Paris‑based startup founded in April 2023 by former DeepMind and Meta researchers. In June 2024, it secured a €600 million Series B round that pushed its valuation to €6 billion, making it Europe’s highest‑valued generative‑AI company. The firm develops high‑performance, open‑weight LLMs such as Mistral 7B, Mixtral 8×22B, and the 123‑billion‑parameter flagship Mistral Large, which debuted through a strategic partnership with Microsoft Azure. Its sparse Mixture‑of‑Experts architecture and permissive licensing drive efficient inference and rapid adoption across both open‑source and enterprise communities.

Key Mistral Models (July 2025)
- Mistral Large 2 (24‑11) – 128 K context, advanced function calling, best‑in‑class multilingual performance.
- Mixtral 8×22B‑Instruct – MoE architecture delivering 90 % GSM8K accuracy at lower cost.
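Function calling on Mistral models follows the OpenAI-style tools schema. A minimal sketch of building such a chat request; the model identifier and the `get_weather` tool are illustrative placeholders, not an official example:

```python
def build_tool_call_request(user_message: str) -> dict:
    """Build an OpenAI-style chat request body with one tool definition.

    The model name and the tool schema are illustrative assumptions,
    not taken from the official Mistral documentation.
    """
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "mistral-large-2411",  # assumed deployment name
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_tool_call_request("What's the weather in Paris?")
```

The returned dictionary is what you would POST as JSON to the chat-completions endpooint of your deployment; the model replies with a `tool_calls` entry when it decides the tool is needed.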
Why Mistral on Azure
Get European‑licensed open weights with Azure’s Content Safety, deterministic scaling, and pay‑as‑you‑go billing, ideal for regulated industries that want transparent models.

Mistral Large 3 is a state-of-the-art, general-purpose multimodal granular Mixture-of-Experts model with 39B active parameters and 673B total parameters, featuring 128 experts per layer and Multi-Latent Attention.
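The efficiency of a granular MoE is easiest to see in the numbers: only a small fraction of the total weights runs for any given token. A quick check using the figures quoted above:

```python
# Figures from the Mistral Large 3 description above.
total_params = 673e9    # total parameters across all experts
active_params = 39e9    # parameters active for a single token

active_fraction = active_params / total_params
print(f"{active_fraction:.1%} of the weights are active per token")  # ~5.8%
```

So per-token compute scales with the 39B active parameters, while capacity scales with the full 673B.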
Document conversion to markdown with interleaved images and text
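One way to consume such interleaved output is to split the returned markdown into alternating text and image segments for downstream rendering or indexing. A minimal sketch; the sample document and image filename are made up:

```python
import re

# Matches standard markdown image syntax: ![alt](src)
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(([^)]+)\)")

def split_markdown(md: str) -> list:
    """Split markdown into ('text', chunk) and ('image', src) segments."""
    segments, pos = [], 0
    for m in IMAGE_RE.finditer(md):
        if m.start() > pos:
            segments.append(("text", md[pos:m.start()]))
        segments.append(("image", m.group(2)))
        pos = m.end()
    if pos < len(md):
        segments.append(("text", md[pos:]))
    return segments

doc = "# Invoice\nTotal: 42 EUR\n![chart](img-0.png)\nThanks."
segments = split_markdown(doc)
```

Here `segments` preserves the original reading order, so text and the images embedded between paragraphs stay interleaved exactly as the converter emitted them.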
Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.
Enhanced Mistral Small 3 with multimodal capabilities and a 128k context length.
Ministral 3B is a state-of-the-art Small Language Model (SLM) optimized for edge computing and on-device applications. Because it is designed for low-latency and compute-efficient inference, it is also the perfect model for standard GenAI applications that have strict latency requirements.
Codestral 25.01 by Mistral AI is designed for code generation, supports 80+ programming languages, and is optimized for tasks like code completion and fill-in-the-middle.
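Fill-in-the-middle requests give the model both the code before and after the cursor. A sketch of building such a request body; the field names follow Mistral's FIM completion API as commonly documented (`prompt` for the prefix, `suffix` for what follows), but treat the exact shape and model identifier as assumptions and check the official docs:

```python
def build_fim_request(prefix: str, suffix: str) -> dict:
    """Build a fill-in-the-middle request body.

    Field names and the model identifier are assumptions based on
    Mistral's FIM completion API; verify against the official docs.
    """
    return {
        "model": "codestral-2501",  # assumed model identifier
        "prompt": prefix,           # code before the cursor
        "suffix": suffix,           # code after the cursor
        "max_tokens": 64,
    }

req = build_fim_request(
    "def add(a, b):\n    return ",
    "\n\nprint(add(1, 2))",
)
```

The model then generates only the missing middle (here, presumably `a + b`), constrained on both sides by the surrounding code.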
The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets. For full details, see the paper and release blog post.
Mistral Small can be used on any language-based task that requires high efficiency and low latency.
Ministral 8B is a state-of-the-art Small Language Model (SLM) optimized for edge computing and on-device applications. Because it is designed for low-latency and compute-efficient inference, it is also the perfect model for standard GenAI applications that have strict latency requirements.
The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks with 6x faster inference. Mixtral-8x7B-v0.1 is a decoder-only model whose feed-forward blocks are split into 8 distinct groups, the "experts". At every layer, for every token, a router network selects two of these experts to process the token and combines their outputs additively.
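The per-token top-2 routing used by Mixtral can be sketched in a few lines: the router scores all 8 experts, the two highest-scoring ones are selected, and their outputs are blended with softmax-normalized weights. The logits below are toy numbers, not a real router's output:

```python
import math

def route_top2(router_logits):
    """Pick the two highest-scoring experts and softmax their logits."""
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    exps = [math.exp(router_logits[i]) for i in top2]
    z = sum(exps)
    # (expert index, blend weight) pairs; weights sum to 1.
    return [(i, e / z) for i, e in zip(top2, exps)]

# Toy router logits for 8 experts; experts 3 and 5 score highest here.
logits = [0.1, -1.2, 0.0, 2.0, 0.3, 1.5, -0.5, 0.2]
selected = route_top2(logits)
```

The token is then processed only by the two selected experts, and their outputs are summed with these weights, which is why inference cost tracks the active parameters rather than the full model size.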
The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of Mixtral-8x22B-v0.1.
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks tested. For full details, see the paper and release blog post.
Mistral Large (2407) is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge and coding capabilities.
Mistral Nemo is a cutting-edge Large Language Model (LLM) boasting state-of-the-art reasoning, world knowledge, and coding capabilities within its size category.
The Mixtral-8x22B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x22B-v0.1 is a pretrained base model and therefore does not have any moderation mechanisms. Evaluation results are available on the Hugging Face Open LLM Leaderboard.
The Mixtral-8x7B-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts with 46.7B total parameters (roughly 12.9B active per token). Mixtral-8x7B-v0.1 outperforms Llama 2 70B on most benchmarks with 6x faster inference. For full details, see the release blog post.
Mistral Large 24.11 offers enhanced system prompts, advanced reasoning and function calling capabilities.
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1:
- 32k context window (vs 8k context in v0.1)
- Rope-theta = 1e6
- No Sliding-Window Attention
For full details, see the release blog post.
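The rope-theta change is what supports the longer context: a larger base stretches the rotary-embedding wavelengths so positions remain distinguishable over many more tokens. A quick comparison of the slowest rotary wavelength under both settings, assuming a head dimension of 128 (an assumption for illustration, not a quoted spec):

```python
import math

def max_rope_wavelength(theta: float, head_dim: int = 128) -> float:
    """Longest rotary wavelength (in token positions).

    RoPE rotates each dimension pair i at frequency theta**(-2i/head_dim);
    the slowest pair (i = head_dim/2 - 1) sets how many positions fit in
    one full rotation. head_dim=128 is an illustrative assumption.
    """
    slowest_freq = theta ** (-(head_dim - 2) / head_dim)
    return 2 * math.pi / slowest_freq

v01 = max_rope_wavelength(1e4)  # theta commonly used with the 8k window
v02 = max_rope_wavelength(1e6)  # theta = 1e6 as in v0.2
```

With these numbers the slowest wavelength grows by roughly two orders of magnitude, which is the intuition for why raising theta accompanies the jump from an 8k to a 32k context window.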