Mistral Document AI (25.05)
Version: 1
Direct from Azure models
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all as part of one Azure AI Foundry platform.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Azure AI Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Key capabilities
About this model
Mistral Document AI comes with a Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-2505, which enables you to extract text and structured content from PDF documents.
Key model capabilities
Enterprise OCR with superior document accuracy
Digitize text from images or PDF document files. Extract and understand complex text, handwriting, tables, and images from any document, with 99%+ accuracy across global languages.
State of the Art Document AI
Go beyond raw text extraction. Our AI interprets tables, forms, invoices, and complex layouts with unprecedented accuracy and cognition.
Advanced extraction
Extract to structured JSON with customizable schema, parse forms, classify documents, and process images (text, charts, signatures). Convert charts to tables, extract fine print from figures, or define custom image types.
Multilingual, multimodal
World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.
Fastest in category
Lightweight and blazing fast, Mistral OCR outperforms bulkier alternatives without sacrificing accuracy.
Basic OCR
- Extracts text content while maintaining document structure and hierarchy
- Preserves formatting like headers, paragraphs, lists and tables
- Returns results in markdown format for easy parsing and rendering
- Handles complex layouts including multi-column text and mixed content
- Processes documents at scale with high accuracy
Annotations
In addition to the basic OCR functionality, the Mistral Document AI API adds annotations functionality, which lets you extract information in a structured JSON format that you provide. Specifically, it offers two types of annotations:
- bbox_annotation: annotates the bounding boxes (bboxes) that the OCR model extracts, such as charts and figures, according to your requirements and the bbox/image annotation format you provide. For instance, you can ask for a description or caption of each figure.
- document_annotation: returns an annotation of the entire document based on the document annotation format you provide.

Annotations support:
- Labeling and annotating data
- Extraction and structuring of specific information from documents into a predefined JSON format
- Automation of data extraction to reduce manual entry and errors
- Efficient handling of large document volumes for enterprise-level applications
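
As a rough illustration of the two annotation types, the sketch below builds a figure-level and a document-level annotation format from a JSON Schema. The field names `bbox_annotation_format` and `document_annotation_format`, the `json_schema` wrapper, the helper `annotation_format`, and both example schemas are assumptions modeled on Mistral's public API documentation; the request shape accepted by a particular Foundry deployment may differ.

```python
import json

def annotation_format(name: str, properties: dict[str, str]) -> dict:
    """Wrap a flat set of string fields in a JSON Schema response format.

    Hypothetical helper: the wrapper layout is an assumption, not a
    confirmed request contract.
    """
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    field: {"type": "string", "description": description}
                    for field, description in properties.items()
                },
                "required": list(properties),
                "additionalProperties": False,
            },
        },
    }

# Figure-level annotation: ask for a caption for each extracted bbox.
bbox_format = annotation_format(
    "figure", {"caption": "One-sentence description of the figure or chart"}
)

# Document-level annotation: pull a few fields from the whole document.
document_format = annotation_format(
    "invoice",
    {"invoice_number": "Invoice identifier", "total": "Total amount due"},
)

request_fragment = {
    "bbox_annotation_format": bbox_format,
    "document_annotation_format": document_format,
}
print(json.dumps(request_fragment, indent=2))
```

Keeping the schema small and flat makes it easier to validate the returned JSON before it reaches downstream systems.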
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
- Document-to-data, at scale. Convert physical documents (contracts, invoices, forms, and reports) to custom-structured digital copies in minutes.
- Extract and analyze. Enable AI-powered insights: detect patterns, validate data, and enhance enterprise search over scanned documents.
- Translate and localize. Quickly localize contracts, reports, and correspondence across languages, with compliance-ready accuracy.
- Automate workflows with AI. Build end-to-end document pipelines — from OCR digitization to natural language querying, with fully automated structuring in-between.
- Monitor compliance and manage risk. Automatically audit document flows, redact sensitive data, or enforce retention policies, while keeping full traceability.
- Regulated sectors needing audit-ready data extraction.
- Global enterprises processing multilingual documents in large volumes.
- Researchers and academic institutions transforming PDFs into structured datasets.
- Compliance-first organizations requiring secure deployment.
Out of scope use cases
- Mistral Document AI on Foundry can process documents of up to 30 MB and 30 pages.
- Document annotations are limited to 8 pages.
- While pure OCR runs quickly and efficiently, annotation can occasionally be slower and may time out.
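
The limits above can be checked before a document is submitted. The following is a minimal sketch, assuming that "30 MB" means 30 × 1024 × 1024 bytes and that the caller already knows the page count; `check_limits` and `annotation_batches` are hypothetical helpers, and the source does not say whether a deployment accepts page subsets, so batching is shown only as a planning aid.

```python
MAX_FILE_BYTES = 30 * 1024 * 1024  # 30 MB, assuming binary megabytes
MAX_OCR_PAGES = 30
MAX_ANNOTATION_PAGES = 8

def check_limits(file_bytes: int, page_count: int, annotate: bool = False) -> list[str]:
    """Return human-readable limit violations; an empty list means none."""
    problems = []
    if file_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {file_bytes} bytes, limit is {MAX_FILE_BYTES}")
    if page_count > MAX_OCR_PAGES:
        problems.append(f"document has {page_count} pages, OCR limit is {MAX_OCR_PAGES}")
    if annotate and page_count > MAX_ANNOTATION_PAGES:
        problems.append(
            f"document has {page_count} pages, annotation limit is {MAX_ANNOTATION_PAGES}"
        )
    return problems

def annotation_batches(page_count: int, batch_size: int = MAX_ANNOTATION_PAGES) -> list[range]:
    """Split zero-based page indices into batches that fit the annotation limit."""
    return [
        range(start, min(start + batch_size, page_count))
        for start in range(0, page_count, batch_size)
    ]

print(check_limits(file_bytes=2_000_000, page_count=10, annotate=True))
print([list(batch) for batch in annotation_batches(10)])
```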
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Technical specs
Training cut-off date
The provider has not supplied this information.
Training time
The provider has not supplied this information.
Input formats
Supports multiple document formats, including:
- image_url: png, jpeg/jpg, avif, and more
- document_url: pdf, pptx, docx, and more
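
To show how the two input types might be selected, here is a small sketch that picks `image_url` or `document_url` from a file extension and builds a request body. The body layout (`model`, `document`, `type`) is an assumption modeled on Mistral's public OCR API, `build_ocr_payload` is a hypothetical helper, and the value to pass as `model` on Foundry (for example a deployment name) is not confirmed by the source.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the formats named above. The source says "and more",
# so these sets are deliberately incomplete.
IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".avif"}
DOCUMENT_EXTENSIONS = {".pdf", ".pptx", ".docx"}

def build_ocr_payload(url: str, model: str = "mistral-ocr-2505") -> dict:
    """Build a request body, choosing image_url or document_url by extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        kind = "image_url"
    elif suffix in DOCUMENT_EXTENSIONS:
        kind = "document_url"
    else:
        raise ValueError(f"unrecognized extension: {suffix!r}")
    # Assumed layout: the URL is stored under a key named after its type.
    return {"model": model, "document": {"type": kind, kind: url}}

print(build_ocr_payload("https://example.com/report.pdf"))
print(build_ocr_payload("https://example.com/scan.PNG?page=1"))
```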
Output formats
- Returns results in markdown format for easy parsing and rendering
- Returns the extracted text content, images bboxes and metadata about the document structure
- Extract to structured JSON with customizable schema
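
Because results come back as markdown, tables can be recovered with ordinary text processing. The sketch below parses the first pipe table in a markdown string into a list of row dictionaries; `parse_markdown_table` and the sample text are illustrative assumptions, and the parser ignores escaped pipes and tables written without a leading `|`.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Parse the first pipe table found in a markdown string into row dicts."""

    def cells(line: str) -> list[str]:
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    lines = markdown.splitlines()
    for i in range(len(lines) - 1):
        header, divider = lines[i].strip(), lines[i + 1].strip()
        # A divider row contains only pipes, dashes, colons, and spaces.
        is_divider = divider.startswith("|") and set(divider) <= set("|-: ")
        if header.startswith("|") and is_divider:
            columns = cells(header)
            rows = []
            for line in lines[i + 2:]:
                if not line.strip().startswith("|"):
                    break  # first non-table line ends the table
                rows.append(dict(zip(columns, cells(line))))
            return rows
    return []

sample = """# Invoice 42

| Item | Qty |
|---|---|
| Widget | 3 |
| Gadget | 1 |

Thank you."""

print(parse_markdown_table(sample))
```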
Supported languages
World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.

| Language | Azure OCR | Google Doc AI | Gemini-2.0-Flash-001 | Mistral Document AI |
|---|---|---|---|---|
| ru | 97.35 | 95.56 | 96.58 | 99.09 |
| fr | 97.50 | 96.36 | 97.06 | 99.20 |
| hi | 96.45 | 95.65 | 94.99 | 97.55 |
| zh | 91.40 | 90.89 | 91.85 | 97.11 |
| pt | 97.96 | 96.24 | 97.25 | 99.42 |
| de | 98.39 | 97.09 | 97.19 | 99.51 |
| es | 98.54 | 97.52 | 97.75 | 99.54 |
| tr | 95.91 | 93.85 | 94.66 | 97.00 |
| uk | 97.81 | 96.24 | 96.70 | 99.29 |
| it | 98.31 | 97.69 | 97.68 | 99.42 |
| ro | 96.45 | 95.14 | 95.88 | 98.79 |
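
To make the per-language comparison easier to read, the sketch below computes how far Mistral Document AI's score sits above the best competing score in each row. The figures are copied from the table above; the helper name `margin_over_best_competitor` is illustrative only.

```python
# Scores per language, in table order:
# (Azure OCR, Google Doc AI, Gemini-2.0-Flash-001, Mistral Document AI)
SCORES = {
    "ru": (97.35, 95.56, 96.58, 99.09),
    "fr": (97.50, 96.36, 97.06, 99.20),
    "hi": (96.45, 95.65, 94.99, 97.55),
    "zh": (91.40, 90.89, 91.85, 97.11),
    "pt": (97.96, 96.24, 97.25, 99.42),
    "de": (98.39, 97.09, 97.19, 99.51),
    "es": (98.54, 97.52, 97.75, 99.54),
    "tr": (95.91, 93.85, 94.66, 97.00),
    "uk": (97.81, 96.24, 96.70, 99.29),
    "it": (98.31, 97.69, 97.68, 99.42),
    "ro": (96.45, 95.14, 95.88, 98.79),
}

def margin_over_best_competitor(scores: tuple[float, ...]) -> float:
    """Difference between the last score (Mistral) and the best of the others."""
    *competitors, mistral = scores
    return round(mistral - max(competitors), 2)

margins = {lang: margin_over_best_competitor(s) for lang, s in SCORES.items()}

# Print languages from largest to smallest margin.
for lang, margin in sorted(margins.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang}: +{margin:.2f}")
```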
Sample JSON response
The provider has not supplied this information.
Model architecture
The provider has not supplied this information.
Long context
The provider has not supplied this information.
Optimizing model performance
Top-tier benchmarks
Mistral Document AI has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We can extract both images and text embedded in documents, unlike the other models. To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web:

| Model | Overall | Math | Multilingual | Scanned | Tables |
|---|---|---|---|---|---|
| Azure OCR | 89.52 | 85.72 | 87.52 | 94.65 | 89.52 |
| Mistral Document AI | 94.89 | 94.29 | 89.55 | 98.96 | 96.12 |
| GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
| Gemini-1.5-Flash-002 | 90.23 | 89.11 | 86.76 | 94.87 | 90.48 |
| Gemini-1.5-Pro-002 | 89.92 | 88.48 | 86.33 | 96.15 | 89.71 |
| Gemini-2.0-Flash-001 | 88.69 | 84.18 | 85.80 | 95.11 | 91.46 |
| Google Document AI | 83.42 | 80.29 | 86.42 | 92.77 | 78.16 |

| Model | Fuzzy Match in Generation |
|---|---|
| Azure OCR | 97.31 |
| Mistral Document AI | 99.02 |
| Google Document AI | 95.88 |
| Gemini-2.0-Flash-001 | 96.53 |
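
The source does not define how the fuzzy match score is computed. Purely as an illustration of what a fuzzy match between a reference transcription and OCR output can look like, the sketch below uses `difflib.SequenceMatcher` from the Python standard library; this is an assumed stand-in, not the benchmark's actual metric.

```python
from difflib import SequenceMatcher

def fuzzy_match(reference: str, generated: str) -> float:
    """Similarity of two strings as a percentage (100.0 means identical)."""
    return 100.0 * SequenceMatcher(None, reference, generated).ratio()

reference = "Invoice total: 120.50 EUR"
generated = "Invoice tota1: 120.50 EUR"  # one misread character

print(f"{fuzzy_match(reference, reference):.2f}")
print(f"{fuzzy_match(reference, generated):.2f}")
```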
Additional assets
The provider has not supplied this information.
Training disclosure
Training, testing and validation
The provider has not supplied this information.
Distribution
Distribution channels
The provider has not supplied this information.
More information
The provider has not supplied this information.
Responsible AI considerations
Safety techniques
Content safety is applied to annotations only; it is not enforced for OCR outputs.
Safety evaluations
The provider has not supplied this information.
Known limitations
- Mistral Document AI on Foundry can process documents of up to 30 MB and 30 pages.
- Document annotations are limited to 8 pages.
- While pure OCR runs quickly and efficiently, annotation can occasionally be slower and may time out.
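
Since annotation requests may time out, callers may want to retry them. The following is a minimal sketch of retrying with exponential backoff; `with_retries` is a hypothetical helper, and the built-in `TimeoutError` stands in for whatever timeout exception your HTTP client actually raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on TimeoutError with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last timeout
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... by default
    raise ValueError("attempts must be at least 1")

# Usage with a stand-in for an annotation request that times out twice.
state = {"calls": 0}

def flaky_annotation_request() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("annotation timed out")
    return "annotation result"

recorded_delays: list[float] = []
result = with_retries(flaky_annotation_request, sleep=recorded_delays.append)
print(result, recorded_delays)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.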
Acceptable use
Acceptable use policy
The provider has not supplied this information.
Quality and performance evaluations
Source: Mistral AI
Top-tier benchmarks
Mistral Document AI has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We can extract both images and text embedded in documents, unlike the other models. To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web:

| Model | Overall | Math | Multilingual | Scanned | Tables |
|---|---|---|---|---|---|
| Azure OCR | 89.52 | 85.72 | 87.52 | 94.65 | 89.52 |
| Mistral Document AI | 94.89 | 94.29 | 89.55 | 98.96 | 96.12 |
| GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
| Gemini-1.5-Flash-002 | 90.23 | 89.11 | 86.76 | 94.87 | 90.48 |
| Gemini-1.5-Pro-002 | 89.92 | 88.48 | 86.33 | 96.15 | 89.71 |
| Gemini-2.0-Flash-001 | 88.69 | 84.18 | 85.80 | 95.11 | 91.46 |
| Google Document AI | 83.42 | 80.29 | 86.42 | 92.77 | 78.16 |

| Model | Fuzzy Match in Generation |
|---|---|
| Azure OCR | 97.31 |
| Mistral Document AI | 99.02 |
| Google Document AI | 95.88 |
| Gemini-2.0-Flash-001 | 96.53 |

| Language | Azure OCR | Google Doc AI | Gemini-2.0-Flash-001 | Mistral Document AI |
|---|---|---|---|---|
| ru | 97.35 | 95.56 | 96.58 | 99.09 |
| fr | 97.50 | 96.36 | 97.06 | 99.20 |
| hi | 96.45 | 95.65 | 94.99 | 97.55 |
| zh | 91.40 | 90.89 | 91.85 | 97.11 |
| pt | 97.96 | 96.24 | 97.25 | 99.42 |
| de | 98.39 | 97.09 | 97.19 | 99.51 |
| es | 98.54 | 97.52 | 97.75 | 99.54 |
| tr | 95.91 | 93.85 | 94.66 | 97.00 |
| uk | 97.81 | 96.24 | 96.70 | 99.29 |
| it | 98.31 | 97.69 | 97.68 | 99.42 |
| ro | 96.45 | 95.14 | 95.88 | 98.79 |
Benchmarking methodology
Source: Mistral AI
To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web.
Public data summary
Source: Mistral AI
The provider has not supplied this information.
Model Specifications
Context Length: 128000
License: Custom
Last Updated: October 2025
Input Type: PDF, Image
Output Type: Text
Provider: Mistral AI
Languages: 27