Cohere Embed 4
Version: 4
Publisher: Cohere
Last updated: May 2025
Embed 4 transforms texts and images into numerical vectors
Multilingual
Multimodal
Cohere’s Embed 4 is a multilingual, multimodal embedding model. It transforms different modalities, such as images, text, and interleaved images and text, into a single vector representation. Embed 4 offers state-of-the-art performance across all modalities (text, images, and interleaved text and images) and in both English and multilingual settings. Embed 4 supports a 128k context length, and images can have a maximum of 2 million pixels. Embed 4 can vectorize interleaved text and images and capture key visual features from screenshots of PDFs, slides, tables, figures, and more, thereby eliminating the need for complex document parsing. Embed 4 offers several compression options, covering both the number of dimensions and the numeric precision of each dimension: the model supports byte and binary quantization as well as Matryoshka embeddings.
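
The compression options above map directly onto the embedding API. Below is a minimal sketch using the Cohere Python SDK; the exact parameter and attribute names (for example `output_dimension` and `embeddings.float_`) are assumptions drawn from the public SDK and may vary by SDK version.

```python
# Minimal sketch: request compressed Embed 4 vectors with the Cohere Python SDK
# (v2 client). Parameter/attribute names follow the public SDK but may differ
# between SDK versions.
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

response = co.embed(
    model="embed-v4.0",
    input_type="search_document",                 # documents to be indexed for retrieval
    texts=["Embed 4 transforms texts and images into numerical vectors."],
    embedding_types=["float", "int8", "binary"],  # full precision plus quantized formats
    output_dimension=512,                         # Matryoshka truncation of the full vector
)

print(len(response.embeddings.float_[0]))   # 512 floats
print(len(response.embeddings.int8[0]))     # 512 signed bytes
print(len(response.embeddings.binary[0]))   # 512 bits packed into 64 signed bytes
```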

Embed-v4.0 Evaluations

The following tables compare Embed-v4.0 against other embedding models. We break the datasets down into public/academic benchmarks as well as by dataset modality.

Evaluation Datasets:

Our evaluations span text-only, image-only, mixed-modality, and fused datasets.

Generic Academic Datasets

BEIR

BEIR is a standard benchmark for general-domain information retrieval. It features a monolingual setup: English queries against an English corpus. The domain is diverse, covering 18 tasks across areas such as fact-checking, biomedical, news, and question answering. The corpora are drawn from various sources including Wikipedia, scientific articles, and web forums. The queries are a mix of natural user queries, questions, and information needs, depending on the dataset. All metrics are NDCG@10.
| Model | Number of Dimensions | nfcorpus | scifact | arguana | scidocs | fiqa | trec-covid | webis-touche2020 | quora | nq | dbpedia-entity | hotpotqa | fever | climate-fever | Average | Differential |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Open AI Text Embedding Large | 1536 | 42.07 | 77.77 | 57.99 | 23.07 | 55.00 | 79.56 | 23.36 | 89.05 | 61.27 | 44.76 | 71.58 | 87.94 | 30.28 | 57.21 | -0.36 |
| Cohere - Embed V3 | 1024 | 38.43 | 72.55 | 56.83 | 20.27 | 42.17 | 79.09 | 32.40 | 86.40 | 61.60 | 43.40 | 70.70 | 89.00 | 25.80 | 55.28 | -2.28 |
| Cohere - Embed V4 | 1536 | 40.25 | 77.10 | 57.32 | 20.97 | 54.50 | 69.36 | 33.67 | 89.22 | 68.46 | 46.53 | 73.50 | 84.63 | 32.84 | 57.56 | 0.00 |
Embed-v4.0 was not optimized for BEIR, as it is a largely saturated benchmark; most of the models listed on MTEB's leaderboard use instructions optimized for the BEIR datasets.
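
Since every table in this section reports NDCG@10, the following sketch shows how the metric is computed from graded relevance labels of the top-ranked results. The function name and label format are illustrative and not part of any benchmark's tooling.

```python
import math

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k for a single query.

    ranked_relevances: graded relevance labels of the retrieved documents,
    in the order the system ranked them, e.g. [0, 0, 2, 1, 0, ...].
    For simplicity, the ideal ranking is computed from these labels only;
    benchmark tooling also accounts for relevant documents that were not retrieved.
    """
    def dcg(labels):
        return sum(
            (2 ** rel - 1) / math.log2(rank + 2)
            for rank, rel in enumerate(labels[:k])
        )

    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# One relevant document (grade 2) ranked at position 3 -> NDCG@10 = 0.5
print(ndcg_at_k([0, 0, 2, 0, 0, 0, 0, 0, 0, 0]))
```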

MIRACL

MIRACL is a standard multilingual information retrieval dataset. It features monolingual setups: queries in each language against that language's Wikipedia corpus. The domain is encyclopedic knowledge, with the corpus sourced from Wikipedia in each supported language. The queries are crowdsourced, modeled after real user search intents, and are available in both the original languages and English translations for cross-lingual evaluation. We benchmarked on a subset of MIRACL, focusing on the most popular languages and only on the monolingual setting. All metrics are NDCG@10.
| Model | Number of Dimensions | Arabic | German | Spanish | French | Hindi | Japanese | Korean | Russian | Chinese | Average | Differential |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ColQwen | Multivector | 74.2 | 67.4 | 78.2 | 67.3 | 65.8 | 66.0 | 73.8 | 68.5 | 69.7 | 70.09 | -10.72 |
| GME (2bn) | 1536 | 78.9 | 74.0 | 80.9 | 72.0 | 64.6 | 72.6 | 73.3 | 75.7 | 78.8 | 74.54 | -6.27 |
| GME (7bn) | 3584 | 83.2 | 79.9 | 84.5 | 77.8 | 72.1 | 77.2 | 76.3 | 81.4 | 80.6 | 79.21 | -1.59 |
| Open AI Text Embedding Large | 1536 | 83.6 | 81.6 | 86.0 | 80.0 | 65.2 | 79.7 | 74.6 | 82.9 | 81.7 | 79.48 | -1.33 |
| Cohere - Embed V3 | 1024 | 86.0 | 80.3 | 84.0 | 76.9 | 77.7 | 77.3 | 79.9 | 80.6 | 80.1 | 80.30 | -0.51 |
| Cohere - Embed V4 | 1024 | 86.1 | 81.5 | 86.5 | 78.3 | 75.5 | 78.4 | 78.3 | 81.2 | 81.5 | 80.81 | 0.00 |

NeuCLIR

NeuCLIR is a standard cross-lingual information retrieval dataset. It features the following cross-lingual setups:
  1. English queries to a Chinese corpus
  2. English queries to a Russian corpus
  3. English queries to a Farsi corpus
The domain is news articles, with corpora sourced from multilingual news sources. The queries were created by human annotators to reflect realistic information needs across languages. All metrics are NDCG@10.
| Model | Number of Dimensions | English Queries / Farsi Corpus | English Queries / Russian Corpus | English Queries / Chinese Corpus | Average | Differential |
|---|---|---|---|---|---|---|
| ColQwen | Multivector | 26.7 | 33.6 | 32.7 | 30.98 | -16.33 |
| GME (2bn) | 1536 | 43.4 | 41.0 | 40.2 | 41.51 | -5.80 |
| GME (7bn) | 3584 | 45.1 | 47.4 | 45.3 | 45.94 | -1.37 |
| Open AI Text Embedding Large | 1536 | 43.7 | 48.4 | 42.1 | 44.75 | -2.56 |
| Cohere - Embed V3 | 1024 | 45.7 | 47.4 | 42.0 | 45.04 | -2.27 |
| Cohere - Embed V4 | 1024 | 48.6 | 49.6 | 43.7 | 47.31 | 0.00 |
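
The cross-lingual results above come from the standard dense-retrieval setup: queries and documents are embedded independently and ranked by vector similarity. The sketch below illustrates that setup with the Cohere Python SDK; the `input_type` values are part of the public API, while the toy corpus and ranking code are purely illustrative.

```python
# Minimal sketch of cross-lingual dense retrieval with Embed 4:
# an English query is matched against a non-English corpus purely in vector space.
import cohere
import numpy as np

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

corpus = [
    "这是一篇关于气候政策的新闻报道。",   # Chinese news snippet (toy example)
    "Статья о мировой экономике.",       # Russian news snippet (toy example)
]

doc_vecs = np.array(
    co.embed(
        model="embed-v4.0",
        input_type="search_document",
        texts=corpus,
        embedding_types=["float"],
    ).embeddings.float_
)

query_vec = np.array(
    co.embed(
        model="embed-v4.0",
        input_type="search_query",
        texts=["news about climate policy"],
        embedding_types=["float"],
    ).embeddings.float_[0]
)

# Cosine similarity ranking (higher = more relevant).
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(corpus[int(np.argmax(scores))])
```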

ViDoRe Benchmark v2

ViDoRe Benchmark v2 is a comprehensive evaluation suite for visual document retrieval systems. It features multilingual and multimodal setups spanning diverse domains, including biomedical research, economics, and environmental, social, and governance (ESG) reports. The corpora are sourced from publicly available documents such as academic papers, government reports, and industry publications. Queries are generated through a hybrid approach of synthetic generation and human-in-the-loop refinement, ensuring they reflect realistic and complex information needs. The benchmark includes the following datasets:
  1. Axa Insurance contains Axa Group insurance policy documents in French, and the queries are in the following languages: English, French, German, and Spanish
  2. MIT Biomedical contains MIT Anatomy Course Lecture Slides in English, and the queries are in the following languages: English, French, German, and Spanish
  3. RSE Restaurant contains ESG reports from companies in the Fast Casual / Quick Serve industry in English, and the queries are in the following languages: English, French, German, and Spanish
  4. Synthetic Macro contains World Economic reports in English, and the queries are in the following languages: English, French, German, and Spanish
ViDoRe Benchmark v2 is designed to evaluate the performance of retrieval models on visually rich documents across multiple languages and domains, with each dataset presenting unique challenges in visual and multilingual retrieval. All metrics are NDCG@10.
| Model | Number of Dimensions | Axa Insurance | MIT Biomedical | RSE Restaurant | Synthetic Macro | Average | Differential |
|---|---|---|---|---|---|---|---|
| ColQwen | Multivector | 52.0 | 53.6 | 46.7 | 48.5 | 50.20 | -5.91 |
| GME (2bn) | 1536 | 59.6 | 54.3 | 54.4 | 54.0 | 55.59 | -0.51 |
| GME (7bn) | 3584 | 59.9 | 53.3 | 47.4 | 53.7 | 53.58 | -2.53 |
| Cohere - Embed V3 | 1024 | 52.7 | 55.3 | 54.5 | 50.1 | 53.15 | -2.96 |
| Cohere - Embed V4 | 1536 | 63.9 | 58.3 | 53.3 | 48.9 | 56.11 | 0.00 |
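
Visual document retrieval of the kind ViDoRe measures relies on embedding page screenshots directly, without a parsing pipeline. The sketch below shows one way to embed a rendered page image with the Cohere Python SDK; the base64 data-URL format is an assumption based on Cohere's image-embedding examples, and the file name is a placeholder.

```python
# Minimal sketch: embed a page screenshot (e.g. a PDF page rendered to PNG)
# so it can be indexed alongside text embeddings. "page.png" is a placeholder.
import base64
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

with open("page.png", "rb") as f:
    data_url = "data:image/png;base64," + base64.b64encode(f.read()).decode("utf-8")

response = co.embed(
    model="embed-v4.0",
    input_type="image",          # image inputs use the dedicated input type
    images=[data_url],           # base64 data URLs, one per image
    embedding_types=["float"],
)

page_vector = response.embeddings.float_[0]
print(len(page_vector))          # dimensionality of the returned embedding
```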

Industry-Specific Datasets:

MPMQA

MPMQA is a specialized information retrieval dataset focused on product manuals. It features a monolingual setup: English queries against an English product manual corpus. The domain is technical support and product documentation, with the corpus sourced from real-world manuals for various consumer electronics and appliances. The queries are natural language questions derived from customer support scenarios, reflecting realistic user information needs. All metrics are NDCG@10.
| Model | Number of Dimensions | MPMQA | Differential |
|---|---|---|---|
| ColQwen | Multivector | 67.1 | -6.48 |
| GME (2bn) | 1536 | 55.9 | -17.69 |
| GME (7bn) | 3584 | 53.5 | -20.06 |
| Open AI Text Embedding Large | 1536 | 58.5 | -15.07 |
| Cohere - Embed V3 | 1024 | 56.7 | -16.90 |
| Cohere - Embed V4 | 1024 | 73.6 | 0.00 |

BioASQ

BioASQ is a standard biomedical information retrieval dataset. It features a monolingual setup: English queries against an English biomedical corpus. The domain is biomedical research, with the corpus sourced from PubMed abstracts. The queries are expert-annotated and derived from real biomedical questions. All metrics are NDCG@10.
| Model | Number of Dimensions | BioASQ | Differential |
|---|---|---|---|
| ColQwen | Multivector | 72.8 | 2.46 |
| GME (2bn) | 1536 | 46.4 | -23.99 |
| GME (7bn) | 3584 | 49.7 | -20.66 |
| Open AI Text Embedding Large | 1536 | 61.1 | -9.24 |
| Cohere - Embed V3 | 1024 | 44.5 | -25.89 |
| Cohere - Embed V4 | 1024 | 70.4 | 0.00 |

FinanceBench

FinanceBench is a benchmark dataset designed for evaluating large language models (LLMs) in financial question answering. It features a monolingual setup: English queries against English financial documents (SEC filings such as 10-Ks, 10-Qs, and 8-Ks). The domain encompasses financial data, with the corpus containing information about publicly traded companies. The dataset comprises 10,231 questions that are ecologically valid, covering a diverse set of scenarios related to financial question answering. These questions are intended to be clear-cut and straightforward to answer, serving as a minimum performance standard for LLMs. All metrics are NDCG@10.
| Model | Number of Dimensions | FinanceBench |
|---|---|---|
| ColQwen | Multivector | 14.4 |
| GME (2bn) | 1536 | 25.0 |
| GME (7bn) | 3584 | 37.1 |
| Open AI Text Embedding Large | 1536 | 34.1 |
| Cohere - Embed V3 | 1024 | 32.8 |
| Cohere - Embed V4 | 1536 | 61.4 |

Model Specifications

Context Length: 131072
License: Custom
Last Updated: May 2025
Input Type: Image, Text
Output Type: Image, Text
Publisher: Cohere
Languages: 10 Languages