Cohere Embed 4
Version: 4
Cohere’s Embed 4 is a multilingual, multimodal embedding model. It can transform different modalities, such as images, text, and interleaved images and text, into a single vector representation. Embed 4 offers state-of-the-art performance across all modalities (text, images, and interleaved text and images) and in both English and multilingual settings.
Embed 4 supports a 128k context length, and images can have a maximum of 2 million pixels. Embed 4 can vectorize interleaved text and images and capture key visual features from screenshots of PDFs, slides, tables, figures, and more, thereby eliminating the need for complex document parsing. Embed 4 offers compression along two axes: the number of dimensions and the number-format precision. The model supports byte and binary quantization as well as Matryoshka embeddings for further compression.
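The two compression axes can be illustrated with a minimal sketch. This is not Cohere's implementation or API; the function names are purely illustrative, and the toy 8-dimensional vector stands in for a real embedding.

```python
# Illustrative sketch (not the Cohere API): how the two compression
# axes described above combine on a float embedding.

def matryoshka_truncate(embedding, dims):
    """Keep only the first `dims` dimensions of a Matryoshka-style
    embedding, where leading dimensions carry the most information."""
    return embedding[:dims]

def binary_quantize(embedding):
    """Map each float dimension to a single bit by sign (1 if >= 0),
    shrinking storage per dimension from 32 bits to 1."""
    return [1 if x >= 0 else 0 for x in embedding]

# A toy 8-dimensional vector standing in for a real 1536-d embedding.
vec = [0.12, -0.40, 0.33, 0.05, -0.07, 0.91, -0.22, 0.48]

short = matryoshka_truncate(vec, 4)  # fewer dimensions
bits = binary_quantize(short)        # lower precision per dimension
print(short)  # [0.12, -0.4, 0.33, 0.05]
print(bits)   # [1, 0, 1, 1]
```

Truncation and quantization compose, which is why the model can trade retrieval quality for index size along both axes at once.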
Embed-v4.0 Evaluations
The following tables showcase Embed-v4.0 evaluations against other embedding models. We break down datasets into public/academic benchmarks as well as by dataset modality.

Evaluation Datasets:

Our evaluations range across text-only, image-only, mixed-modality, and fused datasets.

Generic Academic Datasets
BEIR
BEIR is a standard benchmark dataset for general-domain information retrieval. It features the following monolingual setup: English queries to an English corpus. The domain is diverse, covering 18 tasks across areas such as fact-checking, biomedical, news, and question answering. The corpora are drawn from various sources including Wikipedia, scientific articles, and web forums. The queries are a mix of natural user queries, questions, and information needs, depending on the dataset. All metrics are NDCG@10; the Differential column reports each model's average relative to Embed V4's.

Model | Number of Dimensions | nfcorpus | scifact | arguana | scidocs | fiqa | trec-covid | webis-touche2020 | quora | nq | dbpedia-entity | hotpotqa | fever | climate-fever | Average | Differential |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Open AI Text Embedding Large | 1536 | 42.07 | 77.77 | 57.99 | 23.07 | 55.00 | 79.56 | 23.36 | 89.05 | 61.27 | 44.76 | 71.58 | 87.94 | 30.28 | 57.21 | -0.36 |
Cohere - Embed V3 | 1024 | 38.43 | 72.55 | 56.83 | 20.27 | 42.17 | 79.09 | 32.40 | 86.40 | 61.60 | 43.40 | 70.70 | 89.00 | 25.80 | 55.28 | -2.28 |
Cohere - Embed V4 | 1536 | 40.25 | 77.10 | 57.32 | 20.97 | 54.50 | 69.36 | 33.67 | 89.22 | 68.46 | 46.53 | 73.50 | 84.63 | 32.84 | 57.56 | 0.00 |
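All of the scores above are NDCG@10, which rewards placing highly relevant documents near the top of the first ten results. A self-contained sketch of the metric (standard definition, not tied to any particular benchmark harness):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: graded relevance of each of the
    top-k results, discounted by log2 of its (1-based) rank + 1."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the actual ranking divided by the DCG of the
    ideal (relevance-sorted) ranking; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance of retrieved documents, in retrieved order:
# one relevant document (grade 1) is ranked too low at position 4.
ranking = [3, 2, 0, 1, 0]
print(round(ndcg_at_k(ranking), 4))  # 0.9854 - close to ideal
```

A perfect ranking scores 1.0, so the table values (e.g. 57.56) can be read as percentages of the ideal ordering.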
MIRACL
MIRACL is a standard multilingual information retrieval dataset. It features the following monolingual setups: queries in each language to that language's Wikipedia corpus. The domain is encyclopedic knowledge, with the corpus sourced from Wikipedia in each supported language. The queries are crowdsourced, modeled after real user search intents, and are available in both the original languages and English translations for cross-lingual evaluation. We benchmarked on a subset of MIRACL, focusing on the most popular languages and only in a monolingual setting. All metrics are NDCG@10.

Model | Number of Dimensions | Arabic | German | Spanish | French | Hindi | Japanese | Korean | Russian | Chinese | Average | Differential |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ColQwen | Multivector | 74.2 | 67.4 | 78.2 | 67.3 | 65.8 | 66.0 | 73.8 | 68.5 | 69.7 | 70.09 | -10.72 |
GME (2bn) | 1536 | 78.9 | 74.0 | 80.9 | 72.0 | 64.6 | 72.6 | 73.3 | 75.7 | 78.8 | 74.54 | -6.27 |
GME (7bn) | 3584 | 83.2 | 79.9 | 84.5 | 77.8 | 72.1 | 77.2 | 76.3 | 81.4 | 80.6 | 79.21 | -1.59 |
Open AI Text Embedding Large | 1536 | 83.6 | 81.6 | 86.0 | 80.0 | 65.2 | 79.7 | 74.6 | 82.9 | 81.7 | 79.48 | -1.33 |
Cohere - Embed V3 | 1024 | 86.0 | 80.3 | 84.0 | 76.9 | 77.7 | 77.3 | 79.9 | 80.6 | 80.1 | 80.30 | -0.51 |
Cohere - Embed V4 | 1024 | 86.1 | 81.5 | 86.5 | 78.3 | 75.5 | 78.4 | 78.3 | 81.2 | 81.5 | 80.81 | 0.00 |
NeuCLIR
NeuCLIR is a standard cross-lingual information retrieval dataset. It features the following cross-lingual setups:
- English queries to a Chinese corpus
- English queries to a Russian corpus
- English queries to a Farsi corpus
The domain is news articles, with corpora sourced from multilingual news sources. The queries were created by human annotators to reflect realistic information needs across languages. All metrics are NDCG@10.
Model | Number of Dimensions | English Queries / Farsi Corpus | English Queries / Russian Corpus | English Queries / Chinese Corpus | Average | Differential |
---|---|---|---|---|---|---|
ColQwen | Multivector | 26.7 | 33.6 | 32.7 | 30.98 | -16.33 |
GME (2bn) | 1536 | 43.4 | 41.0 | 40.2 | 41.51 | -5.80 |
GME (7bn) | 3584 | 45.1 | 47.4 | 45.3 | 45.94 | -1.37 |
Open AI Text Embedding Large | 1536 | 43.7 | 48.4 | 42.1 | 44.75 | -2.56 |
Cohere - Embed V3 | 1024 | 45.7 | 47.4 | 42.0 | 45.04 | -2.27 |
Cohere - Embed V4 | 1024 | 48.6 | 49.6 | 43.7 | 47.31 | 0.00 |
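To make the Average and Differential columns concrete: Differential is a model's average minus Embed V4's average (hence 0.00 for Embed V4 itself). Recomputing it from the rounded per-language scores shown above reproduces the published values to within rounding, since the published averages use unrounded scores:

```python
# Recompute GME (7bn)'s NeuCLIR Differential from the table above.
# The published Average (45.94) is computed from unrounded scores, so
# recomputing from the displayed one-decimal values lands within
# ~0.01 of the published -1.37.
gme_7b = [45.1, 47.4, 45.3]        # Farsi, Russian, Chinese columns
embed_v4_avg = 47.31               # Embed V4's Average column

avg = sum(gme_7b) / len(gme_7b)    # ~45.93
differential = avg - embed_v4_avg  # ~-1.38 vs the published -1.37
print(round(avg, 2), round(differential, 2))
```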
ViDoRe Benchmark v2
ViDoRe Benchmark v2 is a comprehensive evaluation suite for visual document retrieval systems. It features multilingual and multimodal setups. The benchmark spans diverse domains, including biomedical research, economics, and environmental, social, and governance (ESG) reports. The corpora are sourced from publicly available documents such as academic papers, government reports, and industry publications. Queries are generated through a hybrid approach of synthetic generation and human-in-the-loop refinement, ensuring they reflect realistic and complex information needs.
- Axa Insurance contains Axa Group insurance policy documents in French; the queries are in English, French, German, and Spanish
- MIT Biomedical contains MIT anatomy course lecture slides in English; the queries are in English, French, German, and Spanish
- RSE Restaurant contains ESG reports from companies in the fast casual / quick serve industry in English; the queries are in English, French, German, and Spanish
- Synthetic Macro contains world economic reports in English; the queries are in English, French, German, and Spanish
Model | Number of Dimensions | Axa Insurance | MIT Biomedical | RSE Restaurant | Synthetic Macro | Average | Differential |
---|---|---|---|---|---|---|---|
ColQwen | Multivector | 52.0 | 53.6 | 46.7 | 48.5 | 50.20 | -5.91 |
GME (2bn) | 1536 | 59.6 | 54.3 | 54.4 | 54.0 | 55.59 | -0.51 |
GME (7bn) | 3584 | 59.9 | 53.3 | 47.4 | 53.7 | 53.58 | -2.53 |
Cohere - Embed V3 | 1024 | 52.7 | 55.3 | 54.5 | 50.1 | 53.15 | -2.96 |
Cohere - Embed V4 | 1536 | 63.9 | 58.3 | 53.3 | 48.9 | 56.11 | 0.00 |
Industry-Specific Datasets:
MPMQA
MPMQA is a specialized information retrieval dataset focused on product manuals. It features the following monolingual setup: English queries to an English product manual corpus. The domain is technical support and product documentation, with the corpus sourced from real-world manuals across various consumer electronics and appliances. The queries are natural language questions derived from customer support scenarios, reflecting realistic user information needs. All metrics are NDCG@10.

Model | Number of Dimensions | MPMQA | Differential |
---|---|---|---|
ColQwen | Multivector | 67.1 | -6.48 |
GME (2bn) | 1536 | 55.9 | -17.69 |
GME (7bn) | 3584 | 53.5 | -20.06 |
Open AI Text Embedding Large | 1536 | 58.5 | -15.07 |
Cohere - Embed V3 | 1024 | 56.7 | -16.90 |
Cohere - Embed V4 | 1024 | 73.6 | 0.00 |
BioASQ
BioASQ is a standard biomedical information retrieval dataset. It features the following monolingual setup: English queries to an English biomedical corpus. The domain is biomedical research, with the corpus sourced from PubMed abstracts. The queries are expert-annotated, derived from real biomedical questions. All metrics are NDCG@10.

Model | Number of Dimensions | BioASQ | Differential |
---|---|---|---|
ColQwen | Multivector | 72.8 | 2.46 |
GME (2bn) | 1536 | 46.4 | -23.99 |
GME (7bn) | 3584 | 49.7 | -20.66 |
Open AI Text Embedding Large | 1536 | 61.1 | -9.24 |
Cohere - Embed V3 | 1024 | 44.5 | -25.89 |
Cohere - Embed V4 | 1024 | 70.4 | 0.00 |
FinanceBench
FinanceBench is a benchmark dataset designed for evaluating large language models (LLMs) in financial question answering. It features the following monolingual setup: English queries to English financial documents (SEC filings such as 10-Ks, 10-Qs, and 8-Ks). The domain encompasses financial data, with the corpus containing information about publicly traded companies. The dataset comprises 10,231 questions that are ecologically valid, covering a diverse set of scenarios related to financial question answering. These questions are intended to be clear-cut and straightforward to answer, serving as a minimum performance standard for LLMs. All metrics are NDCG@10.

Model | Number of Dimensions | FinanceBench |
---|---|---|
ColQwen | Multivector | 14.4 |
GME (2bn) | 1536 | 25.0 |
GME (7bn) | 3584 | 37.1 |
Open AI Text Embedding Large | 1536 | 34.1 |
Cohere - Embed V3 | 1024 | 32.8 |
Cohere - Embed V4 | 1536 | 61.4 |
Model Specifications
Context Length: 131072
License: Custom
Last Updated: May 2025
Input Type: Image, Text
Output Type: Image, Text
Publisher: Cohere
Languages: 10 Languages
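As a hedged illustration of the Image and Text input types above, the sketch below assembles a request payload for a multimodal embedding call. The model name, `input_type`, `embedding_types`, and the `inputs`/`content` structure are assumptions modeled on Cohere's v2 embed API and are not verified here; consult the current Cohere API reference before relying on this schema.

```python
import base64
import json

# Hypothetical payload for embedding interleaved text + image with
# Embed 4. Field names below are assumptions, not a verified schema.
def build_embed_payload(text: str, image_bytes: bytes,
                        mime: str = "image/png") -> str:
    # Images are commonly sent as base64 data URIs.
    data_uri = f"data:{mime};base64," + \
        base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": "embed-v4.0",
        "input_type": "search_document",  # documents being indexed
        "embedding_types": ["float"],
        "inputs": [
            {
                "content": [
                    {"type": "text", "text": text},
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ]
            }
        ],
    }
    return json.dumps(payload)

# Example: a slide screenshot plus its caption in one input.
body = build_embed_payload("Q3 revenue slide", b"\x89PNG...", "image/png")
```

Because text and image travel in one `content` list, both modalities land in a single vector, matching the interleaved behavior described at the top of this page.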