voyage-3.5-embedding-model

Text embedding model optimized for general-purpose (including multilingual) retrieval/search and AI applications. 32K context length.
Voyage AI
Version: 2

About this model

Text embedding models are neural networks that transform text into numerical vectors. They are a crucial building block for semantic search/retrieval systems and retrieval-augmented generation (RAG), where they largely determine retrieval quality. voyage-3.5 is a state-of-the-art general-purpose and multilingual embedding model that outperforms OpenAI-v3-large by 8.26% on average across evaluated domains. Enabled by Matryoshka learning and quantization-aware training, voyage-3.5 supports smaller output dimensions as well as int8 and binary quantization, which dramatically reduce vector database costs with minimal impact on retrieval quality. Learn more about voyage-3.5 here: https://blog.voyageai.com/2025/05/20/voyage-3-5/

Key model capabilities

  • Optimized for general-purpose and multilingual retrieval quality, outperforming OpenAI-v3-large by 8.26% on average across evaluated domains. Compared with OpenAI-v3-large (float, 3072 dimensions), voyage-3.5 (int8, 2048 dimensions) reduces vector database costs by 83% while achieving higher retrieval quality.
  • Supports embeddings of 2048, 1024, 512, and 256 dimensions and offers several embedding quantization options: float (32-bit floating point), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8).
  • 32K token context length, well-suited for applications on long documents.
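The 83% figure above can be reproduced with simple arithmetic. The following is a rough sketch that assumes vector database cost scales linearly with the raw bytes stored per vector and ignores index overhead; the helper name bytes_per_vector is hypothetical.

```python
# Bits used per embedding value for each output data type listed above.
BITS_PER_VALUE = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}


def bytes_per_vector(dimension: int, dtype: str) -> int:
    """Raw storage for one embedding, ignoring any index or metadata overhead."""
    return dimension * BITS_PER_VALUE[dtype] // 8


baseline = bytes_per_vector(3072, "float")  # OpenAI-v3-large (float, 3072)
compact = bytes_per_vector(2048, "int8")    # voyage-3.5 (int8, 2048)
saving = 1 - compact / baseline

print(f"{baseline} bytes vs {compact} bytes per vector, saving {saving:.0%}")
```

Under the same assumption, a 1024-dimension binary embedding occupies 128 bytes per vector.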

Usage

The deployed Azure AI Foundry endpoint exposes the Voyage inference API. Authenticate with your Azure ML endpoint key or a bearer token issued for the workspace.

Generate Embeddings

curl <AZUREML_ENDPOINT_URL>/embeddings \
  -X POST \
  -H "Authorization: Bearer <AZUREML_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"input":["Sample text to embed"],"model":"voyage-3.5"}'
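The same request can be issued from Python. This is a minimal sketch using only the standard library; the function names build_embeddings_request and embed are hypothetical, and the endpoint URL and token are placeholders you supply from your own deployment.

```python
import json
import urllib.request


def build_embeddings_request(endpoint_url, token, texts, model="voyage-3.5"):
    """Build the POST request that mirrors the curl command above."""
    body = json.dumps({"input": texts, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint_url.rstrip("/") + "/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(endpoint_url, token, texts):
    """Send the request and return one embedding per input text."""
    request = build_embeddings_request(endpoint_url, token, texts)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [item["embedding"] for item in payload["data"]]
```

Calling embed("<AZUREML_ENDPOINT_URL>", "<AZUREML_TOKEN>", ["Sample text to embed"]) with real values should return a list containing one vector.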

Supported Parameters

  • input (string or array of strings, required): Text(s) to embed.
  • model (string, required): voyage-3.5.
  • input_type (string, optional): query or document. Indicates whether the input is a search query or a document, which tunes the embedding for retrieval.
  • output_dimension (int, optional): One of 2048, 1024, 512, 256. Defaults to 1024.
  • output_dtype (string, optional): float, int8, uint8, binary, or ubinary. Defaults to float.
  • truncation (bool, optional): Truncate inputs longer than the 32K-token context. Defaults to true.
  • encoding_format (string, optional): Set to base64 to receive embeddings as base64-encoded strings instead of float arrays.

See the full API reference at https://docs.voyageai.com/reference/embeddings-api.
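The parameters listed above can be checked on the client before a request is sent. The sketch below is illustrative only: build_payload and the ALLOWED_* constants are hypothetical names, and the allowed values and defaults are copied from the list in this section rather than from any client library.

```python
ALLOWED_DIMENSIONS = {2048, 1024, 512, 256}
ALLOWED_DTYPES = {"float", "int8", "uint8", "binary", "ubinary"}
ALLOWED_INPUT_TYPES = {None, "query", "document"}


def build_payload(texts, input_type=None, output_dimension=1024,
                  output_dtype="float", truncation=True):
    """Return a JSON-serializable request body, rejecting unsupported values."""
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type!r}")
    if output_dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported output_dimension: {output_dimension!r}")
    if output_dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported output_dtype: {output_dtype!r}")

    payload = {
        "input": texts,
        "model": "voyage-3.5",
        "output_dimension": output_dimension,
        "output_dtype": output_dtype,
        "truncation": truncation,
    }
    # input_type is optional, so it is only sent when explicitly set.
    if input_type is not None:
        payload["input_type"] = input_type
    return payload
```

In a retrieval setup, one would typically build document payloads with input_type="document" at indexing time and query payloads with input_type="query" at search time, keeping output_dimension and output_dtype identical on both sides.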

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.00068755, 0.03410244, -0.02404458, 0.04494607]
    }
  ],
  "model": "voyage-3.5",
  "usage": { "total_tokens": 4 }
}

The embedding array contains the full vector at the requested output_dimension (shown truncated above). When encoding_format is base64, each embedding is returned as a base64 string instead of a float array.
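A response of this shape can be unpacked as sketched below. The function names are hypothetical, and the base64 branch rests on an assumption that is not stated in this document: that the decoded bytes are little-endian 32-bit floats, which would apply only when output_dtype is float. Verify the byte layout against the API reference before relying on it.

```python
import base64
import json
import struct


def decode_embedding(value):
    """Return an embedding as a list of numbers.

    Assumption: a base64 string decodes to little-endian float32 values.
    """
    if isinstance(value, str):
        raw = base64.b64decode(value)
        return list(struct.unpack(f"<{len(raw) // 4}f", raw))
    return list(value)


def parse_embeddings(response_text):
    """Return (embeddings ordered by index, total token count)."""
    payload = json.loads(response_text)
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [decode_embedding(item["embedding"]) for item in items]
    return vectors, payload["usage"]["total_tokens"]
```

Sorting by index keeps each vector aligned with the position of its text in the original input array.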

Quick facts

Model provider: Voyage AI
Type: Embeddings
Lifecycle: Generally available (GA)
Input type: text
Output type: text
Context window: 32000