Mistral Document AI (25.12)
Version: 1
Direct from Azure models
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Key capabilities
About this model
Mistral Document AI comes with an improved Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-2512, which enables you to extract text and structured content from PDFs and a variety of document types. Mistral Document AI offers enterprise-level document processing, combining cutting-edge OCR technology with advanced structured data extraction. Experience faster processing speeds, unparalleled accuracy, and cost-effective solutions, all scalable to meet your needs. Unlock the full potential of your documents with our multilingual support, annotations and adaptable workflows for many document types, enabling you to extract, comprehend, and analyze information with ease.
Enterprise OCR with superior document accuracy
Digitize text from images, PDFs, and a variety of document formats. Extract and understand complex text, handwriting, tables, forms, and images from any document, with benchmark-leading accuracy across global languages.
SOTA doc AI
Our latest model is designed to excel at:
- Handwriting: Mistral OCR accurately interprets cursive, mixed-content annotations, and handwritten text layered over printed forms.
- Forms: Improved detection of boxes, labels, handwritten entries, and dense layouts. Works well on invoices, receipts, compliance forms, government documents, and similar paperwork.
- Scanned & complex documents: Significantly more robust to compression artifacts, skew, distortion, low DPI, and background noise.
- Complex tables: Reconstructs table structures with headers, merged cells, multi-row blocks, and column hierarchies. Outputs HTML table tags with colspan/rowspan to fully preserve layout.
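When tables arrive as HTML with colspan/rowspan, downstream code often needs a plain rectangular grid. The following is a minimal sketch of one way to expand merged cells using only the Python standard library. The class and function names (TableGridParser, html_table_to_grid) and the sample markup are hypothetical, and the sketch assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collects <td>/<th> cells together with their colspan/rowspan values."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list per <tr>; items are (text, colspan, rowspan)
        self._cell = None  # [text_parts, colspan, rowspan] while a cell is open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            self._cell = [[], int(a.get("colspan") or 1), int(a.get("rowspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, colspan, rowspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), colspan, rowspan))
            self._cell = None


def html_table_to_grid(html):
    """Return a list of rows where merged cells are repeated in every slot they span."""
    parser = TableGridParser()
    parser.feed(html)
    grid = {}  # (row, col) -> cell text
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, colspan, rowspan in cells:
            while (r, c) in grid:  # skip slots already filled by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    n_rows = max(r for r, _ in grid) + 1 if grid else 0
    n_cols = max(c for _, c in grid) + 1 if grid else 0
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]
```

Repeating the merged value in each spanned slot keeps every row the same length, which makes the result easy to load into a spreadsheet or dataframe. Other consumers may prefer to leave spanned slots empty.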
Advanced extraction
With our latest OCR update, table extraction formatting is configurable between default markdown, markdown tables, and HTML tables, allowing for advanced table extraction support. In addition, explicit header and footer extraction is available via configurable parameters.
Multilingual, multimodal
World-class multilingual OCR: outperforms other solutions with benchmark-leading accuracy across 25+ languages.
Fastest in category
Lightweight and blazing fast, Mistral OCR outperforms bulkier alternatives without sacrificing accuracy.
For industries needing precision, speed, and compliance in document workflows
- Regulated sectors needing audit-ready data extraction.
- Global enterprises processing multilingual documents in large volumes.
- Researchers and academic institutions transforming PDFs into structured datasets.
- Compliance-first organizations requiring secure deployment.
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Intended Use
Primary Use Cases
- Document-to-data, at scale. Convert physical documents (contracts, invoices, forms, and reports) to custom-structured digital copies in minutes.
- Extract and analyze. Enable AI-powered insights from scanned documents: detect patterns, validate data, and enhance enterprise search.
- Translate and localize. Quickly localize contracts, reports, and correspondence across languages, with compliance-ready accuracy.
- Automate workflows with AI. Build end-to-end document pipelines, from OCR digitization to natural language querying, with fully automated structuring in between.
- Monitor compliance and manage risk. Automatically audit document flows, redact sensitive data, or enforce retention policies, while keeping full traceability.
Basic OCR
Mistral Document AI comes with an improved Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-2512, which enables you to extract text and structured content from PDFs and a variety of document types.
Our latest model improves upon the previous OCR version (mistral-ocr-2507) with a 74% win-rate improvement on forms, scanned documents, complex tables, and handwriting.
Key Features
- Extracts text content while maintaining document structure and hierarchy
- Preserves formatting like headers, paragraphs, lists and tables
- Returns results in markdown format for easy parsing and rendering
- Handles complex layouts including multi-column text and mixed content
- Processes documents at scale with high accuracy
- Supports multiple document formats including:
  - image_url: png, jpeg/jpg, avif, tiff, gif, heic/heif, bmp, webp
  - document_url: pdf, pptx, docx, txt, epub, xml, rtf, odt, bib, fb2, ipynb, tex, opml, man
The OCR processor returns the extracted text content, image bounding boxes (bboxes), and metadata about the document structure, making it easy to work with the recognized content programmatically.
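As a rough illustration of how the two format families listed above might be handled when preparing a request, the sketch below chooses between an image_url and a document_url input based on the file extension. The dictionary layout (the type, document, and model keys) and the helper names build_document_chunk and build_ocr_request are assumptions made for illustration, not a confirmed request schema. Check the API reference for your deployment before relying on them.

```python
import posixpath
from urllib.parse import urlparse

# Extensions taken from the supported-format list above.
IMAGE_EXTENSIONS = {
    "png", "jpeg", "jpg", "avif", "tiff", "gif", "heic", "heif", "bmp", "webp",
}
DOCUMENT_EXTENSIONS = {
    "pdf", "pptx", "docx", "txt", "epub", "xml", "rtf", "odt",
    "bib", "fb2", "ipynb", "tex", "opml", "man",
}


def build_document_chunk(url):
    """Pick the input kind (image_url or document_url) from the URL's extension."""
    path = urlparse(url).path  # drops any query string or fragment
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    if ext in IMAGE_EXTENSIONS:
        return {"type": "image_url", "image_url": url}
    if ext in DOCUMENT_EXTENSIONS:
        return {"type": "document_url", "document_url": url}
    raise ValueError(f"unsupported file extension: {ext!r}")


def build_ocr_request(url, model="mistral-ocr-2512"):
    """Assemble a request body; the overall shape is an assumption, not a spec."""
    return {"model": model, "document": build_document_chunk(url)}
```

Rejecting unknown extensions before sending anything avoids spending a request on a file the service cannot read.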
Annotations
In addition to the basic OCR functionality, the Mistral Document AI API adds the annotations functionality, which allows you to extract information in a structured JSON format that you provide. Specifically, it offers two types of annotations:
- bbox_annotation: annotates the bboxes extracted by the OCR model (charts, figures, etc.) according to the user's requirements and the provided bbox/image annotation format. For instance, the user may ask for a description or caption of each figure.
- document_annotation: returns the annotation of the entire document based on the provided document annotation format.
Key Features
- Labeling and annotating data
- Extraction and structuring of specific information from documents into a predefined JSON format
- Automation of data extraction to reduce manual entry and errors
- Efficient handling of large document volumes for enterprise-level applications
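To make the idea of a "predefined JSON format" concrete, here is a minimal sketch of a document annotation format written as a JSON Schema, plus a small check that a returned annotation contains the required fields. The invoice fields, the wrapper layout produced by as_annotation_format, and both helper names are hypothetical. The exact structure the service expects should be taken from the API reference.

```python
import json

# Hypothetical schema for an invoice; the field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["invoice_number", "total_amount"],
    "additionalProperties": False,
}


def as_annotation_format(name, schema):
    """Wrap a JSON Schema in a json_schema envelope (assumed layout)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def missing_required_fields(annotation_text, schema):
    """Parse an annotation returned as JSON text and list absent required keys."""
    data = json.loads(annotation_text)
    return [key for key in schema.get("required", []) if key not in data]
```

A check like missing_required_fields is a cheap guard before writing extracted values into a downstream system. A full JSON Schema validator would also verify types and nested structure.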
Limitations & Known Issues
- Mistral Document AI on Foundry can process documents up to 30 MB and 30 pages.
- Document Annotations are limited to 8 pages.
- While the pure OCR process performs efficiently and quickly, the annotation process can occasionally be slower and may result in timeouts. An optimization is expected within a few weeks.
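The size and page limits above can be checked locally before a document is submitted. The sketch below is one possible pre-flight check. The function name check_limits is hypothetical, and the sketch treats the size limit as 30 million bytes, a conservative assumption since the exact byte definition is not stated here.

```python
# Limits taken from the list above; the byte interpretation is an assumption.
MAX_FILE_BYTES = 30 * 1000 * 1000
MAX_PAGES = 30
MAX_ANNOTATION_PAGES = 8


def check_limits(size_bytes, page_count, document_annotation=False):
    """Return a list of human-readable problems; an empty list means no limit is hit."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    if page_count > MAX_PAGES:
        problems.append(f"document has {page_count} pages; limit is {MAX_PAGES}")
    if document_annotation and page_count > MAX_ANNOTATION_PAGES:
        problems.append(
            f"document annotation covers at most {MAX_ANNOTATION_PAGES} pages; "
            f"document has {page_count}"
        )
    return problems
```

Returning every violated limit at once, instead of stopping at the first, lets a caller report all issues to the user in a single pass.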
Supported languages:
de, fr, es, nl, it, pt, hu, pl, cs, da, ro, no, sv, id, th, vi, tl, ar, he, hi, bn, gu, kn, ta, te, ml, en, ru, uk, ko, ja, zh, tr, hy, ka
Preview Terms
This Azure Direct Model is a Preview and is subject to the Supplemental Terms of Use for Microsoft Azure Previews.
Quality and performance evaluations
Top-tier benchmarks
To raise the bar, we introduced more challenging internal benchmarks based on real business use-case examples from customers. We then evaluated several models across the domains highlighted below, comparing their outputs to ground truth using a fuzzy-match metric for accuracy:
| Model | Forms | Handwritten | Invoices | Complex Tables | Historical Scanned |
|---|---|---|---|---|---|
| DeepSeek OCR | 82.6 | 57.2 | 70.5 | 84.4 | 81.1 |
| Google Document AI | 79.6 | 73.9 | 72.4 | 75.9 | 87.1 |
| Azure OCR | 86.2 | 78.2 | 80.2 | 85.9 | 83.7 |
| AWS Textract | 84.5 | 72.4 | 78.4 | 84.8 | 81.0 |
| Mistral Document AI | 95.9 | 88.9 | 91.8 | 96.6 | 96.7 |

| Language Group | DeepSeek OCR | Azure OCR | Google Doc AI | AWS Textract | Mistral Document AI |
|---|---|---|---|---|---|
| Chinese | 90.5 | 87.0 | 83.3 | N/A | 97.1 |
| East-Asian | 90.7 | 89.9 | 80.7 | N/A | 97.6 |
| Eastern Europe | 90.3 | 94.5 | 91.4 | 88.9 | 98.6 |
| English | 94.6 | 93.5 | 91.6 | 93.9 | 98.6 |
| Western Europe | 94.6 | 94.3 | 92.2 | 94.3 | 98.8 |
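The exact fuzzy-match metric behind these scores is not specified here. As a rough sketch of what such a comparison can look like, the example below scores an OCR output against ground truth on a 0 to 100 scale using the similarity ratio from Python's difflib. The normalization step and the function names are assumptions for illustration, not the benchmark's actual method.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so layout differences are not penalized."""
    return " ".join(text.lower().split())


def fuzzy_match_score(predicted, truth):
    """Similarity between OCR output and ground truth, scaled to 0-100."""
    matcher = SequenceMatcher(None, normalize(predicted), normalize(truth), autojunk=False)
    return 100.0 * matcher.ratio()
```

A fuzzy score tolerates small character-level slips that an exact-match comparison would count as total failures, which is why this family of metrics is common for OCR evaluation.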
Model Specifications
Context Length: 128000
License: Custom
Last Updated: February 2026
Input Type: PDF, Image
Output Type: Text
Provider: Mistral AI
Languages: 35 languages