Mistral Document AI (25.05)
Version: 1
Mistral Document AI offers enterprise-level document processing, combining cutting-edge OCR technology with advanced structured data extraction. Experience faster processing speeds, unparalleled accuracy, and cost-effective solutions, all scalable to meet your needs. Unlock the full potential of your documents with our multilingual support, annotations and adaptable workflows for many document types, enabling you to extract, comprehend, and analyze information with ease.
Why Mistral for Document AI?
Enterprise OCR with superior document accuracy
Digitize text from images or PDF files. Extract and understand complex text, handwriting, tables, and images from any document, with 99%+ accuracy across global languages.
State of the Art Document AI
Go beyond raw text extraction. Our AI interprets tables, forms, invoices, and complex layouts with unprecedented accuracy and cognition.
Advanced extraction
Extract to structured JSON with a customizable schema, parse forms, classify documents, and process images (text, charts, signatures). Convert charts to tables, extract fine print from figures, or define custom image types.
Multilingual, multimodal
World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.
Fastest in category
Lightweight and blazing fast, Mistral OCR outperforms bulkier alternatives without sacrificing accuracy.
For industries needing precision, speed, and compliance in document workflows
- Regulated sectors needing audit-ready data extraction.
- Global enterprises processing multilingual documents in large volumes.
- Researchers and academic institutions transforming PDFs into structured datasets.
- Compliance-first organizations requiring secure deployment.
Content Safety
Content safety is applied to annotations only; it is not enforced for OCR outputs.
Intended Use
Primary Use Cases
- Document-to-data, at scale. Convert physical documents (contracts, invoices, forms, and reports) to custom-structured digital copies in minutes.
- Extract and analyze. Enable AI-powered insights: detect patterns, validate data, and enhance enterprise search over scanned documents.
- Translate and localize. Quickly localize contracts, reports, and correspondence with compliance-ready accuracy.
- Automate workflows with AI. Build end-to-end document pipelines — from OCR digitization to natural language querying, with fully automated structuring in-between.
- Monitor compliance and manage risk. Automatically audit document flows, redact sensitive data, or enforce retention policies, while keeping full traceability.
Basic OCR
Mistral Document AI comes with a Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-2505, which enables you to extract text and structured content from PDF documents.
Key Features
- Extracts text content while maintaining document structure and hierarchy
- Preserves formatting like headers, paragraphs, lists and tables
- Returns results in markdown format for easy parsing and rendering
- Handles complex layouts including multi-column text and mixed content
- Processes documents at scale with high accuracy
- Supports multiple document formats, including:
  - image_url: png, jpeg/jpg, avif and more
  - document_url: pdf, pptx, docx and more
The OCR processor returns the extracted text content, image bounding boxes (bboxes), and metadata about the document structure, making it easy to work with the recognized content programmatically.
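As a rough illustration of the two input kinds listed above, the sketch below picks between image_url and document_url from a file extension and builds a request body. The payload field names (`model`, `document`, `type`) and the response shape (`pages`, each with a `markdown` field) are assumptions made for illustration and are not confirmed by this page; the helpers `build_ocr_request` and `join_pages` are hypothetical.

```python
from urllib.parse import urlparse

# Extensions named in the text; the page says "and more", so these sets are not exhaustive.
IMAGE_EXTENSIONS = {"png", "jpeg", "jpg", "avif"}
DOCUMENT_EXTENSIONS = {"pdf", "pptx", "docx"}


def build_ocr_request(url: str, model: str = "mistral-ocr-2505") -> dict:
    """Build a request body for an OCR call (field names are assumptions)."""
    path = urlparse(url).path
    extension = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if extension in IMAGE_EXTENSIONS:
        document = {"type": "image_url", "image_url": url}
    elif extension in DOCUMENT_EXTENSIONS:
        document = {"type": "document_url", "document_url": url}
    else:
        raise ValueError(f"unrecognized extension: {extension!r}")
    return {"model": model, "document": document}


def join_pages(response: dict) -> str:
    """Concatenate per-page markdown, assuming {"pages": [{"markdown": ...}, ...]}."""
    return "\n\n".join(page["markdown"] for page in response.get("pages", []))
```

Because the output is described as markdown, joining pages with a blank line keeps headings and tables from adjacent pages from running together.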
Annotations
In addition to the basic OCR functionality, the Mistral Document AI API adds the annotations functionality, which allows you to extract information in a structured JSON format that you provide. Specifically, it offers two types of annotations:
- bbox_annotation: annotates the bounding boxes (bboxes) extracted by the OCR model, such as charts and figures, according to the user's requirements and the provided bbox/image annotation format. For instance, the user may ask for a description or caption of each figure.
- document_annotation: returns the annotation of the entire document based on the provided document annotation format.
Key Features
- Labeling and annotating data
- Extraction and structuring of specific information from documents into a predefined JSON format
- Automation of data extraction to reduce manual entry and errors
- Efficient handling of large document volumes for enterprise-level applications
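To make the "predefined JSON format" idea concrete, here is a minimal sketch of two annotation formats expressed as JSON Schema, one per annotation type. The envelope layout, the option names `bbox_annotation_format` and `document_annotation_format`, and the invoice and figure fields are all assumptions chosen for illustration; this page does not specify them.

```python
import json


def json_schema_format(name: str, properties: dict) -> dict:
    """Wrap field definitions in a JSON Schema envelope (envelope layout is an assumption)."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "schema": {
                "type": "object",
                "properties": properties,
                "required": sorted(properties),
                "additionalProperties": False,
            },
        },
    }


# Hypothetical per-figure format for bbox_annotation: caption each extracted image.
bbox_format = json_schema_format(
    "figure",
    {
        "image_type": {"type": "string", "description": "e.g. chart, table, signature"},
        "caption": {"type": "string", "description": "One-sentence description"},
    },
)

# Hypothetical whole-document format for document_annotation: invoice fields.
document_format = json_schema_format(
    "invoice",
    {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "currency": {"type": "string"},
    },
)

# Assumed option names; check the service documentation before relying on them.
annotation_options = {
    "bbox_annotation_format": bbox_format,
    "document_annotation_format": document_format,
}


def parse_annotation(raw: str, fmt: dict) -> dict:
    """Decode a returned annotation string and check that every required key is present."""
    data = json.loads(raw)
    missing = [key for key in fmt["json_schema"]["schema"]["required"] if key not in data]
    if missing:
        raise ValueError(f"annotation is missing keys: {missing}")
    return data
```

Checking required keys on the way back in is a cheap guard against partially filled annotations before they reach downstream systems.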
Limitations & Known Issues
- Mistral Document AI on Foundry can process documents of up to 30 MB in size and 30 pages in length.
- Document Annotations are limited to 8 pages.
- While the pure OCR process is fast and efficient, the annotation process can occasionally be slower and may result in timeouts.
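The limits above can be checked before submitting a document. The following is a minimal pre-flight sketch: it reads "30 MB" as 30 × 1024 × 1024 bytes (an assumption, since the page does not define the unit precisely), and the helper names are hypothetical. Whether the service accepts page subsets for annotation is not stated here, so `page_batches` only computes the ranges.

```python
from typing import List

MAX_BYTES = 30 * 1024 * 1024  # "30 MB" read as 30 MiB; an assumption
MAX_PAGES = 30                # overall page limit stated above
MAX_ANNOTATION_PAGES = 8      # document annotation page limit stated above


def limit_violations(size_bytes: int, page_count: int,
                     document_annotation: bool = False) -> List[str]:
    """Return human-readable reasons a document exceeds the stated limits."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append(f"size {size_bytes} bytes exceeds {MAX_BYTES}")
    if page_count > MAX_PAGES:
        problems.append(f"{page_count} pages exceeds {MAX_PAGES}")
    if document_annotation and page_count > MAX_ANNOTATION_PAGES:
        problems.append(
            f"{page_count} pages exceeds the annotation limit of {MAX_ANNOTATION_PAGES}"
        )
    return problems


def page_batches(page_count: int, batch_size: int = MAX_ANNOTATION_PAGES) -> List[range]:
    """Split zero-based page indices into consecutive batches of at most batch_size."""
    return [
        range(start, min(start + batch_size, page_count))
        for start in range(0, page_count, batch_size)
    ]
```

For example, a 20-page document would yield three batches covering pages 0 to 7, 8 to 15, and 16 to 19.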
Top-tier benchmarks
Mistral Document AI has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We can extract both images and text embedded in documents, unlike the other models. To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web:
Natively multilingual
Since Mistral’s founding, we have aspired to serve the world with our models, and consequently strived for multilingual capabilities across our offerings. Mistral Document AI takes this to a new level, being able to parse, understand, and transcribe thousands of scripts, fonts, and languages across all continents. This versatility is crucial for both global organizations that handle documents from diverse linguistic backgrounds, as well as hyperlocal businesses serving niche markets.
Benchmarks per language
Model | Overall | Math | Multilingual | Scanned | Tables |
---|---|---|---|---|---|
Azure OCR | 89.52 | 85.72 | 87.52 | 94.65 | 89.52 |
Mistral Document AI | 94.89 | 94.29 | 89.55 | 98.96 | 96.12 |
GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
Gemini-1.5-Flash-002 | 90.23 | 89.11 | 86.76 | 94.87 | 90.48 |
Gemini-1.5-Pro-002 | 89.92 | 88.48 | 86.33 | 96.15 | 89.71 |
Gemini-2.0-Flash-001 | 88.69 | 84.18 | 85.80 | 95.11 | 91.46 |
Google Document AI | 83.42 | 80.29 | 86.42 | 92.77 | 78.16 |
Model | Fuzzy Match in Generation |
---|---|
Azure OCR | 97.31 |
Mistral Document AI | 99.02 |
Google-Document-AI | 95.88 |
Gemini-2.0-Flash-001 | 96.53 |
Language | Azure OCR | Google Doc AI | Gemini-2.0-Flash-001 | Mistral Document AI |
---|---|---|---|---|
ru | 97.35 | 95.56 | 96.58 | 99.09 |
fr | 97.50 | 96.36 | 97.06 | 99.20 |
hi | 96.45 | 95.65 | 94.99 | 97.55 |
zh | 91.40 | 90.89 | 91.85 | 97.11 |
pt | 97.96 | 96.24 | 97.25 | 99.42 |
de | 98.39 | 97.09 | 97.19 | 99.51 |
es | 98.54 | 97.52 | 97.75 | 99.54 |
tr | 95.91 | 93.85 | 94.66 | 97.00 |
uk | 97.81 | 96.24 | 96.70 | 99.29 |
it | 98.31 | 97.69 | 97.68 | 99.42 |
ro | 96.45 | 95.14 | 95.88 | 98.79 |
Model Specifications
Context Length: 128000
License: Custom
Last Updated: August 2025
Input Type: PDF, Image
Output Type: Text
Publisher: Mistral AI
Languages: 27 languages