mistral-document-ai-2505

Document conversion to markdown with interleaved images and text
Mistral AI
Direct from Azure
Version: 1
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all part of Microsoft Foundry.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

About this model

Mistral Document AI comes with a Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-2505, which enables you to extract text and structured content from PDF documents and images.

Key model capabilities

Enterprise OCR with superior document accuracy

Digitize text from images or PDF files. Extract and understand complex text, handwriting, tables, and images from any document, with 99%+ accuracy across global languages.

State-of-the-art document AI

Go beyond raw text extraction. Our AI interprets tables, forms, invoices, and complex layouts with unprecedented accuracy and cognition.

Advanced extraction

Extract to structured JSON with customizable schema, parse forms, classify documents, and process images (text, charts, signatures). Convert charts to tables, extract fine print from figures, or define custom image types.

Multilingual, multimodal

World-class multilingual OCR: outperforms other solutions with 99%+ accuracy across 25+ languages.

Fastest in category

Lightweight and blazing fast, Mistral OCR outperforms bulkier alternatives without sacrificing accuracy.

Basic OCR

  • Extracts text content while maintaining document structure and hierarchy
  • Preserves formatting like headers, paragraphs, lists and tables
  • Returns results in markdown format for easy parsing and rendering
  • Handles complex layouts including multi-column text and mixed content
  • Processes documents at scale with high accuracy
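
Because results come back as markdown, downstream handling can stay simple. The sketch below stitches per-page markdown into a single document. The response shape (a `pages` list whose entries carry `index` and `markdown` fields) is an assumption made for illustration and is not specified on this page; `join_pages` is a hypothetical helper, and the sample data is hand-written rather than real model output.

```python
from typing import Any


def join_pages(ocr_response: dict[str, Any], separator: str = "\n\n") -> str:
    """Concatenate per-page markdown in page order.

    Assumes (hypothetically) a response shaped like
    {"pages": [{"index": 0, "markdown": "..."}, ...]}.
    """
    # Sort defensively in case pages arrive out of order.
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))
    return separator.join(p.get("markdown", "").strip() for p in pages)


# Hand-written sample data, not real model output.
sample = {
    "pages": [
        {"index": 1, "markdown": "## Totals\n\n| Item | Amount |\n|---|---|\n| Fee | 10 |"},
        {"index": 0, "markdown": "# Invoice\n\nBilled to: Example Corp"},
    ]
}
document_markdown = join_pages(sample)
```

Headers, lists, and tables survive as ordinary markdown, so the joined string can be passed to any markdown renderer or parser without further conversion.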

Annotations

In addition to basic OCR, the Mistral Document AI API provides annotations, which let you extract information as structured JSON that follows a format you provide. It offers two types of annotations:
  • bbox_annotation: annotates the bounding boxes (bboxes) extracted by the OCR model, such as charts and figures, according to the user's requirements and the provided bbox/image annotation format. For instance, the user may ask for a description or caption of each figure.
  • document_annotation: returns an annotation of the entire document based on the provided document annotation format.
Typical uses of annotations include:
  • Labeling and annotating data
  • Extraction and structuring of specific information from documents into a predefined JSON format
  • Automation of data extraction to reduce manual entry and errors
  • Efficient handling of large document volumes for enterprise-level applications
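
To make the two annotation types concrete, here is a minimal sketch of how the annotation formats might be described as JSON Schema and bundled into a request body. Everything beyond the model name is an assumption for illustration: the request field names (`document`, `bbox_annotation_format`, `document_annotation_format`), the `json_schema` envelope, the example URL, and the hypothetical helper `json_schema_format` are not confirmed by this page, so check the API reference for your deployment before relying on them.

```python
import json
from typing import Any


def json_schema_format(name: str, properties: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Wrap a flat set of properties in a JSON Schema envelope (assumed shape)."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "schema": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
                "additionalProperties": False,
            },
        },
    }


# Per-figure annotation: ask for a caption and a coarse figure type.
bbox_format = json_schema_format(
    "figure",
    {"caption": {"type": "string"}, "figure_type": {"type": "string"}},
)

# Whole-document annotation: pull a few invoice fields.
document_format = json_schema_format(
    "invoice",
    {"vendor": {"type": "string"}, "total": {"type": "number"}},
)

# Hypothetical request body; field names are assumptions, not a documented contract.
request_body = {
    "model": "mistral-document-ai-2505",
    "document": {"type": "document_url", "document_url": "https://example.com/invoice.pdf"},
    "bbox_annotation_format": bbox_format,
    "document_annotation_format": document_format,
}
payload = json.dumps(request_body, indent=2)
```

Keeping each format as a plain schema makes it easy to validate the returned annotations against the same definition that was sent.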

Quick facts

Model provider: Mistral AI
Type: Image to text
Lifecycle: Generally available (GA)
Input type: pdf, image
Output type: text
Context window: 128k
Token limits: 4096 output