Cohere Embed v3 Multilingual

Version: 1

Cohere • Last updated August 2025

Cohere Embed Multilingual is the market's leading text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering.

RAG

Models from Microsoft, Partners, and Community

Models from Microsoft, Partners, and Community are a select portfolio of curated models, both general-purpose and niche, spanning diverse scenarios and developed by Microsoft teams, partners, and community contributors.

Managed by Microsoft: Purchase and manage models directly through Azure with a single license, world-class support, and enterprise-grade Azure infrastructure.
Validated by providers: Each model is validated and maintained by its respective provider, with Azure offering integration and deployment guidance.
Innovation and agility: Combines Microsoft research models with rapid, community-driven advancements.
Seamless Azure integration: Standard Azure AI Foundry experience, with support managed by the model provider.
Flexible deployment: Deployable as Managed Compute or Serverless API, based on provider preference.

Learn more about models from Microsoft, Partners, and Community

Key capabilities

About this model

Cohere Embed Multilingual is the market's leading multimodal (text, image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering.

Key model capabilities

Semantic search
Retrieval-augmented generation (RAG)
Classification
Clustering
Multimodal (text, image) representation
Cross-language search capabilities
Support for 100+ languages
Search within a language (e.g., search with a French query on French documents)
Search across languages (e.g., search with an English query on Chinese documents)
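The search capabilities listed above rest on comparing embedding vectors. The following is a minimal sketch of that ranking step, assuming the query and documents have already been embedded. The three-dimensional vectors are made-up toy values rather than model output, and `cosine_similarity` and `rank_documents` are hypothetical helper names, not part of any Cohere or Azure SDK. Cosine similarity is used here as a common choice for comparing embeddings; the source does not state which metric is recommended.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Sort (doc_id, score) pairs from most to least similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption: illustrative only).
doc_vecs = {
    "fr_doc": [0.9, 0.1, 0.0],
    "zh_doc": [0.1, 0.9, 0.1],
    "en_doc": [0.0, 0.2, 0.9],
}
query_vec = [0.2, 0.8, 0.1]  # stands in for an embedded English query

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because a multilingual model places text from different languages in one shared vector space, the same ranking step would apply whether the query and documents share a language or not.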

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

The provider has not supplied this information.

Out of scope use cases

Prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.

Technical specs

The provider has not supplied this information.

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

The provider has not supplied this information.

Output formats

The provider has not supplied this information.

Supported languages

Embed Multilingual supports 100+ languages.

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

The provider has not supplied this information.

Optimizing model performance

The provider has not supplied this information.

Additional assets

Embed Multilingual has state-of-the-art (SOTA) performance on multilingual benchmarks such as MIRACL. Multilingual evaluation results can be found in the Embed v3.0 Miracl Evaluation Results, and full MTEB results can be found in the Embed v3.0 MTEB Evaluation Results. Evaluations against multimodal embedding models can be found in the Embed v3.0 Multimodal Evaluation Results.

Training disclosure

Training, testing and validation

This model was trained on nearly 1B English training pairs and nearly 0.5B non-English training pairs from 100+ languages.

Distribution

Distribution channels

The provider has not supplied this information.

More information

The provider has not supplied this information.

Responsible AI considerations

Safety techniques

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: Cohere

Embed Multilingual has state-of-the-art (SOTA) performance on multilingual benchmarks such as MIRACL. Multilingual evaluation results can be found in the Embed v3.0 Miracl Evaluation Results, and full MTEB results can be found in the Embed v3.0 MTEB Evaluation Results. Evaluations against multimodal embedding models can be found in the Embed v3.0 Multimodal Evaluation Results.

Benchmarking methodology

Source: Cohere

The provider has not supplied this information.

Public data summary

Source: Cohere

The provider has not supplied this information.

Model Specifications

Context Length: 512

License: Custom

Last Updated: August 2025

Input Type: Text

Output Type: Embeddings

Provider: Cohere

Languages: 10 Languages
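The context length listed above limits how much text a single embedding request can represent, so longer documents are typically split before embedding. The following is a rough sketch of fixed-size chunking with overlap, assuming the 512 figure is measured in tokens. `chunk_tokens` is a hypothetical helper, and the whitespace-separated words only stand in for tokens: the model's actual tokenizer will produce different counts, so a real pipeline would need a safety margin or the provider's tokenizer.

```python
def chunk_tokens(tokens, max_len=512, overlap=0):
    """Split a token list into windows of at most max_len items.

    Consecutive windows share `overlap` items so that text cut at a
    boundary still appears with some surrounding context.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Assumption: whitespace-separated words approximate tokens for illustration.
words = [f"w{i}" for i in range(1200)]
chunks = chunk_tokens(words, max_len=512, overlap=32)
print([len(chunk) for chunk in chunks])
```

Each chunk would then be embedded separately, and search results can be mapped back to the source document by keeping track of which chunk came from where.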

Quick Start