Mistral Document AI (25.05)
Version: 1
Mistral AI
Last updated October 2025
Document conversion to markdown with interleaved images and text
Vision
Low latency

Direct from Azure models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all as part of one Azure AI Foundry platform.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Azure AI Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

Key capabilities

About this model

Mistral Document AI comes with a Document OCR (Optical Character Recognition) processor, powered by our latest OCR model mistral-ocr-2505, which enables you to extract text and structured content from PDF documents.

Key model capabilities

Enterprise OCR with superior document accuracy

Digitize text from images or PDF files. Extract and understand complex text, handwriting, tables, and images from any document, with 99%+ accuracy across global languages.

State of the Art Document AI

Go beyond raw text extraction. Our AI interprets tables, forms, invoices, and complex layouts with unprecedented accuracy and cognition.

Advanced extraction

Extract to structured JSON with customizable schema, parse forms, classify documents, and process images (text, charts, signatures). Convert charts to tables, extract fine print from figures, or define custom image types.

Multilingual, multimodal

World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.

Fastest in category

Lightweight and blazing fast, Mistral OCR outperforms bulkier alternatives without sacrificing accuracy.

Basic OCR

  • Extracts text content while maintaining document structure and hierarchy
  • Preserves formatting like headers, paragraphs, lists and tables
  • Returns results in markdown format for easy parsing and rendering
  • Handles complex layouts including multi-column text and mixed content
  • Processes documents at scale with high accuracy
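As a rough illustration of consuming the markdown output described above, the sketch below joins per-page markdown into one string. The response shape (a `pages` list whose entries carry `index` and `markdown`) is an assumption made for this example, as is the helper name `pages_to_markdown`; check the service's actual response before relying on it.

```python
from typing import Any


def pages_to_markdown(ocr_response: dict[str, Any], separator: str = "\n\n") -> str:
    """Concatenate per-page markdown in page order.

    Assumes each page is a dict with an integer "index" and a "markdown" string.
    """
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))
    return separator.join(p.get("markdown", "") for p in pages)


# Hypothetical response, hand-written for illustration only.
sample_response = {
    "pages": [
        {"index": 1, "markdown": "| Item | Qty |\n| --- | --- |\n| Pen | 2 |"},
        {"index": 0, "markdown": "# Invoice\n\nBilled to: ACME"},
    ]
}

print(pages_to_markdown(sample_response))
```

Sorting by `index` keeps the output stable even if pages arrive out of order, which matters when headers and tables span page boundaries.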

Annotations

In addition to basic OCR, the Mistral Document AI API provides annotations, which let you extract information into a structured JSON format that you define. It offers two types of annotations:
  • bbox_annotation: annotates the bounding boxes (bboxes) that the OCR model extracts, such as charts and figures, according to your requirements and the bbox/image annotation format you provide. For instance, you can ask it to describe or caption a figure.
  • document_annotation: returns an annotation of the entire document based on the document annotation format you provide.
Annotations support tasks such as:
  • Labeling and annotating data
  • Extraction and structuring of specific information from documents into a predefined JSON format
  • Automation of data extraction to reduce manual entry and errors
  • Efficient handling of large document volumes for enterprise-level applications
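The two annotation types could be requested along the following lines. This is a sketch under stated assumptions: the request field names `bbox_annotation_format` and `document_annotation_format`, the `json_schema` wrapper, the model name, and the example URL are not given on this page and should be verified against the API reference. The helper `json_schema_format` is hypothetical.

```python
import json
from typing import Any


def json_schema_format(name: str, properties: dict[str, Any]) -> dict[str, Any]:
    """Wrap a flat set of JSON Schema properties in an annotation format (assumed shape)."""
    schema = {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


# Assumed request body; every field name here is illustrative.
request_body = {
    "model": "mistral-document-ai-2505",  # assumed deployment name
    "document": {
        "type": "document_url",
        "document_url": "https://example.com/invoice.pdf",  # placeholder URL
    },
    # Per-figure annotation: ask for a caption for each extracted bbox.
    "bbox_annotation_format": json_schema_format(
        "figure", {"caption": {"type": "string"}}
    ),
    # Whole-document annotation: pull out a few invoice fields.
    "document_annotation_format": json_schema_format(
        "invoice", {"vendor": {"type": "string"}, "total": {"type": "number"}}
    ),
}

print(json.dumps(request_body, indent=2))
```

Keeping the schema small and marking every property as required tends to make the structured output easier to validate downstream.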

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

  • Document-to-data, at scale. Convert physical documents (contracts, invoices, forms, and reports) to custom-structured digital copies in minutes.
  • Extract and analyze. Enable AI-powered insights: detect patterns, validate data, and enhance enterprise search out of scanned documents.
  • Translate and localize. Quickly localize contracts, reports, and correspondence across languages, with compliance-ready accuracy.
  • Automate workflows with AI. Build end-to-end document pipelines — from OCR digitization to natural language querying, with fully automated structuring in-between.
  • Monitor compliance and manage risk. Automatically audit document flows, redact sensitive data, or enforce retention policies, while keeping full traceability.
For industries needing precision, speed, and compliance in document workflows:
  • Regulated sectors needing audit-ready data extraction.
  • Global enterprises processing multilingual documents in large volumes.
  • Researchers and academic institutions transforming PDFs into structured datasets.
  • Compliance-first organizations requiring secure deployment.

Out of scope use cases

  • Mistral Document AI on Foundry can process documents up to 30 MB in size and 30 pages long.
  • Document annotations are limited to 8 pages.
  • While pure OCR runs quickly and efficiently, the annotation process can occasionally be slower and may result in timeouts.
Content safety is applied to annotations only; it is not enforced for OCR outputs.
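The limits above can be checked before a document is submitted. The following is a minimal sketch: the function name `check_limits` is hypothetical, and it assumes the size limit means 30 × 1024 × 1024 bytes, which this page does not specify.

```python
MAX_BYTES = 30 * 1024 * 1024  # assumption: the 30 MB limit counted in binary megabytes
MAX_PAGES = 30
MAX_ANNOTATION_PAGES = 8


def check_limits(
    size_bytes: int, page_count: int, wants_document_annotation: bool = False
) -> list[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    if page_count > MAX_PAGES:
        problems.append(f"document has {page_count} pages, limit is {MAX_PAGES}")
    if wants_document_annotation and page_count > MAX_ANNOTATION_PAGES:
        problems.append(
            f"document annotation covers at most {MAX_ANNOTATION_PAGES} pages, "
            f"got {page_count}"
        )
    return problems


# A 12-page file is fine for OCR but too long for document annotation.
print(check_limits(2_000_000, 12))
print(check_limits(2_000_000, 12, wants_document_annotation=True))
```

Because annotation requests may also time out, a caller would likely pair a check like this with a retry or with splitting long documents into chunks of 8 pages or fewer.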

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.

Technical specs

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

Supports multiple document formats including:
  • image_url: png, jpeg/jpg, avif and more...
  • document_url: pdf, pptx, docx and more...
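One way to pick between `image_url` and `document_url` is by file extension, as sketched below. The extension sets cover only the formats named above, the helper `document_chunk` is hypothetical, and sending the file as a base64 data URI is an assumption about what the endpoint accepts.

```python
import base64
import mimetypes
from pathlib import PurePath

# Only the formats listed on this page; the service supports more.
IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".avif"}
DOCUMENT_EXTENSIONS = {".pdf", ".pptx", ".docx"}


def document_chunk(filename: str, data: bytes) -> dict[str, str]:
    """Build an input entry keyed by image_url or document_url (assumed shape)."""
    ext = PurePath(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        kind = "image_url"
    elif ext in DOCUMENT_EXTENSIONS:
        kind = "document_url"
    else:
        raise ValueError(f"unsupported extension in this sketch: {ext!r}")
    # Fall back to a generic MIME type when the platform does not know the extension.
    mime = mimetypes.guess_type(f"file{ext}")[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": kind, kind: f"data:{mime};base64,{encoded}"}


chunk = document_chunk("report.pdf", b"%PDF-1.7")
print(chunk["type"])
```

Raising on unknown extensions is a deliberate choice for the sketch: failing locally is cheaper than sending a request the service will reject.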

Output formats

  • Returns results in markdown format for easy parsing and rendering
  • Returns the extracted text content, images bboxes and metadata about the document structure
  • Extract to structured JSON with customizable schema
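To show how the image bboxes in the output might be consumed, the sketch below computes the pixel size of each extracted image. The field names (`images`, `id`, `top_left_x`, `top_left_y`, `bottom_right_x`, `bottom_right_y`) are assumptions about the response shape, and `ImageBox` and `image_boxes` are hypothetical names.

```python
from typing import Any, NamedTuple


class ImageBox(NamedTuple):
    page: int
    image_id: str
    width: int
    height: int


def image_boxes(ocr_response: dict[str, Any]) -> list[ImageBox]:
    """Collect every image bounding box with its width and height in pixels."""
    boxes = []
    for page in ocr_response.get("pages", []):
        for img in page.get("images", []):
            boxes.append(
                ImageBox(
                    page=page.get("index", 0),
                    image_id=img["id"],
                    width=img["bottom_right_x"] - img["top_left_x"],
                    height=img["bottom_right_y"] - img["top_left_y"],
                )
            )
    return boxes


# Hypothetical response fragment, hand-written for illustration only.
sample = {
    "pages": [
        {
            "index": 0,
            "markdown": "![img-0.jpeg](img-0.jpeg)",
            "images": [
                {
                    "id": "img-0.jpeg",
                    "top_left_x": 100,
                    "top_left_y": 200,
                    "bottom_right_x": 500,
                    "bottom_right_y": 450,
                }
            ],
        }
    ]
}

for box in image_boxes(sample):
    print(box)
```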

Supported languages

World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.

Language  Azure OCR  Google Doc AI  Gemini-2.0-Flash-001  Mistral Document AI
ru        97.35      95.56          96.58                 99.09
fr        97.50      96.36          97.06                 99.20
hi        96.45      95.65          94.99                 97.55
zh        91.40      90.89          91.85                 97.11
pt        97.96      96.24          97.25                 99.42
de        98.39      97.09          97.19                 99.51
es        98.54      97.52          97.75                 99.54
tr        95.91      93.85          94.66                 97.00
uk        97.81      96.24          96.70                 99.29
it        98.31      97.69          97.68                 99.42
ro        96.45      95.14          95.88                 98.79
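The per-language figures above can be read programmatically, for example to see which languages reach the 99% mark and how large the gap to the next-best system is. The numbers below are copied from the table; the variable and function names are illustrative only.

```python
# Columns: Azure OCR, Google Doc AI, Gemini-2.0-Flash-001, Mistral Document AI
SCORES = {
    "ru": (97.35, 95.56, 96.58, 99.09),
    "fr": (97.50, 96.36, 97.06, 99.20),
    "hi": (96.45, 95.65, 94.99, 97.55),
    "zh": (91.40, 90.89, 91.85, 97.11),
    "pt": (97.96, 96.24, 97.25, 99.42),
    "de": (98.39, 97.09, 97.19, 99.51),
    "es": (98.54, 97.52, 97.75, 99.54),
    "tr": (95.91, 93.85, 94.66, 97.00),
    "uk": (97.81, 96.24, 96.70, 99.29),
    "it": (98.31, 97.69, 97.68, 99.42),
    "ro": (96.45, 95.14, 95.88, 98.79),
}


def mistral_margin(lang: str) -> float:
    """Points by which Mistral Document AI leads the best other system."""
    *others, mistral = SCORES[lang]
    return round(mistral - max(others), 2)


at_or_above_99 = sorted(l for l, s in SCORES.items() if s[-1] >= 99.0)
below_99 = sorted(l for l, s in SCORES.items() if s[-1] < 99.0)

print("at or above 99:", at_or_above_99)
print("below 99:", below_99)
print("largest margin:", max(SCORES, key=mistral_margin))
```

In this table, seven of the eleven listed languages score 99 or higher, and the widest lead over the other systems is for zh.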

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

The provider has not supplied this information.

Optimizing model performance

Top-tier benchmarks

Mistral Document AI has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We can extract both images and text embedded in documents, unlike the other models. To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web:

Model                 Overall  Math   Multilingual  Scanned  Tables
Azure OCR             89.52    85.72  87.52         94.65    89.52
Mistral Document AI   94.89    94.29  89.55         98.96    96.12
GPT-4o-2024-11-20     89.77    87.55  86.00         94.58    91.70
Gemini-1.5-Flash-002  90.23    89.11  86.76         94.87    90.48
Gemini-1.5-Pro-002    89.92    88.48  86.33         96.15    89.71
Gemini-2.0-Flash-001  88.69    84.18  85.80         95.11    91.46
Google Document AI    83.42    80.29  86.42         92.77    78.16
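To make the size of the lead in each category concrete, the sketch below finds the runner-up per column and the difference in points. The scores are copied from the benchmark table above; the function name `lead_over_runner_up` is illustrative.

```python
CATEGORIES = ("Overall", "Math", "Multilingual", "Scanned", "Tables")
BENCHMARK = {
    "Azure OCR": (89.52, 85.72, 87.52, 94.65, 89.52),
    "Mistral Document AI": (94.89, 94.29, 89.55, 98.96, 96.12),
    "GPT-4o-2024-11-20": (89.77, 87.55, 86.00, 94.58, 91.70),
    "Gemini-1.5-Flash-002": (90.23, 89.11, 86.76, 94.87, 90.48),
    "Gemini-1.5-Pro-002": (89.92, 88.48, 86.33, 96.15, 89.71),
    "Gemini-2.0-Flash-001": (88.69, 84.18, 85.80, 95.11, 91.46),
    "Google Document AI": (83.42, 80.29, 86.42, 92.77, 78.16),
}
TARGET = "Mistral Document AI"


def lead_over_runner_up(category: str) -> tuple[str, float]:
    """Return the best other model in a category and the target's lead in points."""
    col = CATEGORIES.index(category)
    runner_up = max(
        (model for model in BENCHMARK if model != TARGET),
        key=lambda model: BENCHMARK[model][col],
    )
    lead = round(BENCHMARK[TARGET][col] - BENCHMARK[runner_up][col], 2)
    return runner_up, lead


for category in CATEGORIES:
    runner_up, lead = lead_over_runner_up(category)
    print(f"{category}: +{lead} over {runner_up}")
```

The runner-up differs by category, so a single overall score hides which competing system is closest for a given document type.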

Natively multilingual

Since Mistral's founding, we have aspired to serve the world with our models, and consequently strived for multilingual capabilities across our offerings. Mistral Document AI takes this to a new level, being able to parse, understand, and transcribe thousands of scripts, fonts, and languages across all continents. This versatility is crucial for both global organizations that handle documents from diverse linguistic backgrounds, as well as hyperlocal businesses serving niche markets.

Model                 Fuzzy Match in Generation
Azure OCR             97.31
Mistral Document AI   99.02
Google-Document-AI    95.88
Gemini-2.0-Flash-001  96.53

Additional assets

The provider has not supplied this information.

Training disclosure

Training, testing and validation

The provider has not supplied this information.

Distribution

Distribution channels

The provider has not supplied this information.

More information

The provider has not supplied this information.

Responsible AI considerations

Safety techniques

Content safety is applied to annotations only; it is not enforced for OCR outputs.

Safety evaluations

The provider has not supplied this information.

Known limitations

  • Mistral Document AI on Foundry can process documents up to 30 MB in size and 30 pages long.
  • Document annotations are limited to 8 pages.
  • While pure OCR runs quickly and efficiently, the annotation process can occasionally be slower and may result in timeouts.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: Mistral AI

Top-tier benchmarks

Mistral Document AI has consistently outperformed other leading OCR models in rigorous benchmark tests. Its superior accuracy across multiple aspects of document analysis is illustrated below. We can extract both images and text embedded in documents, unlike the other models. To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web:

Model                 Overall  Math   Multilingual  Scanned  Tables
Azure OCR             89.52    85.72  87.52         94.65    89.52
Mistral Document AI   94.89    94.29  89.55         98.96    96.12
GPT-4o-2024-11-20     89.77    87.55  86.00         94.58    91.70
Gemini-1.5-Flash-002  90.23    89.11  86.76         94.87    90.48
Gemini-1.5-Pro-002    89.92    88.48  86.33         96.15    89.71
Gemini-2.0-Flash-001  88.69    84.18  85.80         95.11    91.46
Google Document AI    83.42    80.29  86.42         92.77    78.16

Natively multilingual

Since Mistral's founding, we have aspired to serve the world with our models, and consequently strived for multilingual capabilities across our offerings. Mistral Document AI takes this to a new level, being able to parse, understand, and transcribe thousands of scripts, fonts, and languages across all continents. This versatility is crucial for both global organizations that handle documents from diverse linguistic backgrounds, as well as hyperlocal businesses serving niche markets.

Model                 Fuzzy Match in Generation
Azure OCR             97.31
Mistral Document AI   99.02
Google-Document-AI    95.88
Gemini-2.0-Flash-001  96.53

Benchmarks per language

Language  Azure OCR  Google Doc AI  Gemini-2.0-Flash-001  Mistral Document AI
ru        97.35      95.56          96.58                 99.09
fr        97.50      96.36          97.06                 99.20
hi        96.45      95.65          94.99                 97.55
zh        91.40      90.89          91.85                 97.11
pt        97.96      96.24          97.25                 99.42
de        98.39      97.09          97.19                 99.51
es        98.54      97.52          97.75                 99.54
tr        95.91      93.85          94.66                 97.00
uk        97.81      96.24          96.70                 99.29
it        98.31      97.69          97.68                 99.42
ro        96.45      95.14          95.88                 98.79

Benchmarking methodology

Source: Mistral AI

To ensure fairness in our comparison, we evaluated using a test set that is text-only, containing a variety of sources including papers and PDFs sourced from the web.

Public data summary

Source: Mistral AI

The provider has not supplied this information.

Model Specifications

Context Length: 128000
License: Custom
Last Updated: October 2025
Input Type: PDF, Image
Output Type: Text
Provider: Mistral AI
Languages: 27 languages