Mistral OCR 25.03
Version: 1
PLAYGROUND WILL SOON BE AVAILABLE FOR OCR
The OCR endpoint returns .MD format. Combine it with Mistral Small 3.1 to return JSON format. See this cookbook for a detailed tutorial.
Mistral OCR 25.03 is an Optical Character Recognition API that sets a new standard in document understanding. Unlike other models, Mistral OCR comprehends each element of documents—media, text, tables, equations—with unprecedented accuracy and cognition. It takes images and PDFs as input and extracts content in an ordered interleaved text and images.
As a result, Mistral OCR 25.03 is an ideal model to use in combination with a RAG system taking multimodal documents (such as slides or complex PDFs) as input.
Mistral OCR 25.03 excels in understanding complex document elements, including interleaved imagery, mathematical expressions, tables, and advanced layouts such as LaTeX formatting. The model enables deeper understanding of rich documents such as scientific papers with charts, graphs, equations and figures.
Being lighter weight than most models in the category, Mistral OCR 25.03 performs significantly faster than its peers, processing thousands of pages per minute. The ability to rapidly process documents ensures continuous learning and improvement even for high-throughput environments.
To further enhance its capabilities, Mistral OCR 25.03 can be coupled with Mistral Small 3.1 to reformat the results. This combination ensures that the extracted content is not only accurate but also presented in a structured and coherent manner, making it suitable for various downstream applications and analyses. Have a look at this cookbook to combine OCR with another model.
Intended Use
Primary Use Cases
Help your organization elevate its knowledge by transforming your extensive document repositories into actions and solutions. Some of the key use cases where Mistral OCR 25.03 is making a significant impact include:- Digitizing scientific research: Leading research institutions have been experimenting with Mistral OCR to convert scientific papers and journals into AI-ready formats, making them accessible to downstream intelligence engines. This has facilitated measurably faster collaboration and accelerated scientific workflows.
- Preserving historical and cultural heritage: Organizations and nonprofits that are custodians of heritage have been using Mistral OCR to digitize historical documents and artifacts, ensuring their preservation and making them accessible to a broader audience.
- Streamlining customer service: Customer service departments are exploring Mistral OCR to transform documentation and manuals into indexed knowledge, reducing response times and improving customer satisfaction.
- Making literature across design, education, legal, etc. AI ready: Mistral OCR has also been helping companies convert technical literature, engineering drawings, lecture notes, presentations, regulatory filings and much more into indexed, answer-ready formats, unlocking intelligence and productivity across millions of documents.
Top-tier benchmarks
Mistral OCR 25.03 has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We extract embedded images from documents along with text. The other LLMs compared below, do not have that capability. For a fair comparison, we evaluate them on our internal “text-only” test-set containing various publication papers, and PDFs from the web; below:
Natively multilingual
Since Mistral’s founding, we have aspired to serve the world with our models, and consequently strived for multilingual capabilities across our offerings. Mistral OCR 25.03 takes this to a new level, being able to parse, understand, and transcribe thousands of scripts, fonts, and languages across all continents. This versatility is crucial for both global organizations that handle documents from diverse linguistic backgrounds, as well as hyperlocal businesses serving niche markets.
Benchmarks per language
Model | Overall | Math | Multilingual | Scanned | Tables |
---|---|---|---|---|---|
Google Document AI | 83.42 | 80.29 | 86.42 | 92.77 | 78.16 |
Azure OCR | 89.52 | 85.72 | 87.52 | 94.65 | 89.52 |
Gemini-1.5-Flash-002 | 90.23 | 89.11 | 86.76 | 94.87 | 90.48 |
Gemini-1.5-Pro-002 | 89.92 | 88.48 | 86.33 | 96.15 | 89.71 |
Gemini-2.0-Flash-001 | 88.69 | 84.18 | 85.80 | 95.11 | 91.46 |
GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
Mistral OCR 25.03 | 94.89 | 94.29 | 89.55 | 98.96 | 96.12 |
Model | Fuzzy Match in Generation |
---|---|
Google-Document-AI | 95.88 |
Gemini-2.0-Flash-001 | 96.53 |
Azure OCR | 97.31 |
Mistral OCR 25.03 | 99.02 |
Language | Azure OCR | Google Doc AI | Gemini-2.0-Flash-001 | Mistral OCR 2503 |
---|---|---|---|---|
ru | 97.35 | 95.56 | 96.58 | 99.09 |
fr | 97.50 | 96.36 | 97.06 | 99.20 |
hi | 96.45 | 95.65 | 94.99 | 97.55 |
zh | 91.40 | 90.89 | 91.85 | 97.11 |
pt | 97.96 | 96.24 | 97.25 | 99.42 |
de | 98.39 | 97.09 | 97.19 | 99.51 |
es | 98.54 | 97.52 | 97.75 | 99.54 |
tr | 95.91 | 93.85 | 94.66 | 97.00 |
uk | 97.81 | 96.24 | 96.70 | 99.29 |
it | 98.31 | 97.69 | 97.68 | 99.42 |
ro | 96.45 | 95.14 | 95.88 | 98.79 |
Model Specifications
Context Length128000
LicenseCustom
Last UpdatedApril 2025
Input TypePdf,Image
Output TypeText
PublisherMistral AI
Languages27 Languages