voyage-3.5-embedding-model
Text embedding model optimized for general-purpose (including multilingual) retrieval/search and AI applications. 32K context length.
About this model
Text embedding models are neural networks that transform text into numerical vectors. They are a crucial building block for semantic search and retrieval systems and for retrieval-augmented generation (RAG), and they determine the retrieval quality.
voyage-3.5 is a state-of-the-art general-purpose, multilingual embedding model that outperforms OpenAI-v3-large by 8.26% on average across evaluated domains. Enabled by Matryoshka learning and quantization-aware training, voyage-3.5 supports smaller embedding dimensions as well as int8 and binary quantization, which dramatically reduce vector database costs with minimal impact on retrieval quality.
Learn more about voyage-3.5 here: https://blog.voyageai.com/2025/05/20/voyage-3-5/
Key model capabilities
- Optimized for general-purpose and multilingual retrieval quality, outperforming OpenAI-v3-large by 8.26% on average across evaluated domains. Compared with OpenAI-v3-large (float, 3072 dimensions), voyage-3.5 (int8, 2048 dimensions) reduces vector database costs by 83% while achieving higher retrieval quality.
- Supports embeddings of 2048, 1024, 512, and 256 dimensions and offers multiple embedding quantization options: float (32-bit floating point), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8).
- 32K token context length, well-suited for applications on long documents.
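The 83% figure above is consistent with a simple bytes-per-vector comparison. The sketch below assumes that vector database cost scales with the raw size of each stored vector and ignores index overhead; the helper names are illustrative and not part of any Voyage or Azure API.

```python
# Bits used per embedding dimension for each output data type.
# binary/ubinary pack 8 dimensions into one byte.
BITS_PER_DIM = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}


def bytes_per_vector(dimension: int, dtype: str) -> int:
    """Raw payload size of one embedding, ignoring any index overhead."""
    return dimension * BITS_PER_DIM[dtype] // 8


def storage_reduction(baseline_bytes: int, candidate_bytes: int) -> float:
    """Fractional saving of the candidate relative to the baseline."""
    return 1 - candidate_bytes / baseline_bytes


baseline = bytes_per_vector(3072, "float")  # OpenAI-v3-large (float, 3072)
candidate = bytes_per_vector(2048, "int8")  # voyage-3.5 (int8, 2048)
reduction = storage_reduction(baseline, candidate)
print(f"{baseline} B -> {candidate} B, {reduction:.1%} smaller")
# prints "12288 B -> 2048 B, 83.3% smaller"
```

Under the same assumption, a 2048-dimension binary embedding occupies 256 bytes per vector.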
Usage
The deployed Azure AI Foundry endpoint exposes the Voyage inference API. Authenticate with your Azure ML endpoint key or a bearer token issued for the workspace.
Generate Embeddings
curl <AZUREML_ENDPOINT_URL>/embeddings \
-X POST \
-H "Authorization: Bearer <AZUREML_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"input":["Sample text to embed"],"model":"voyage-3.5"}'
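The same request can be issued from Python. This is a minimal sketch using only the standard library; the function names are illustrative, and the endpoint URL and token are the same placeholders as in the curl command and must be replaced with values from your own deployment.

```python
import json
import urllib.request


def build_embeddings_request(endpoint_url: str, token: str,
                             texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"input": texts, "model": "voyage-3.5"}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint_url.rstrip("/") + "/embeddings",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def embed(endpoint_url: str, token: str, texts: list[str]) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_embeddings_request(endpoint_url, token, texts)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call would look like embed("<AZUREML_ENDPOINT_URL>", "<AZUREML_TOKEN>", ["Sample text to embed"]). Building the request separately from sending it makes the headers and body easy to inspect without network access.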
Supported Parameters
- input (string or array of strings, required): Text(s) to embed.
- model (string, required): voyage-3.5.
- input_type (string, optional): query or document. Tunes embeddings for retrieval.
- output_dimension (int, optional): One of 2048, 1024, 512, 256. Defaults to 1024.
- output_dtype (string, optional): float, int8, uint8, binary, or ubinary. Defaults to float.
- truncation (bool, optional): Truncate inputs longer than the 32K-token context. Defaults to true.
- encoding_format (string, optional): Set to base64 to receive embeddings as base64-encoded strings instead of float arrays.
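The parameters above can be checked on the client before a request is sent. The following is a sketch of one way to do that: build_payload is a hypothetical helper, not part of the Voyage API, and its checks mirror only the values listed in this section. Options left unset are omitted from the body so that the service applies its documented defaults.

```python
ALLOWED_INPUT_TYPES = {"query", "document"}
ALLOWED_DIMENSIONS = {2048, 1024, 512, 256}
ALLOWED_DTYPES = {"float", "int8", "uint8", "binary", "ubinary"}


def build_payload(texts, input_type=None, output_dimension=None,
                  output_dtype=None, truncation=None, encoding_format=None):
    """Return a JSON-serializable request body for the embeddings endpoint."""
    if isinstance(texts, str):
        texts = [texts]  # the API accepts a single string or a list
    if not texts:
        raise ValueError("input must contain at least one text")
    if input_type is not None and input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(ALLOWED_INPUT_TYPES)}")
    if output_dimension is not None and output_dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"output_dimension must be one of {sorted(ALLOWED_DIMENSIONS)}")
    if output_dtype is not None and output_dtype not in ALLOWED_DTYPES:
        raise ValueError(f"output_dtype must be one of {sorted(ALLOWED_DTYPES)}")
    if encoding_format not in (None, "base64"):
        raise ValueError("encoding_format, when set, must be 'base64'")

    payload = {"input": list(texts), "model": "voyage-3.5"}
    optional = {
        "input_type": input_type,
        "output_dimension": output_dimension,
        "output_dtype": output_dtype,
        "truncation": truncation,
        "encoding_format": encoding_format,
    }
    # Keep only the options the caller actually set (False is kept, None is not).
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, build_payload("What is RAG?", input_type="query", output_dimension=512) yields a body with input, model, input_type, and output_dimension only.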
Response
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.00068755, 0.03410244, -0.02404458, 0.04494607]
    }
  ],
  "model": "voyage-3.5",
  "usage": { "total_tokens": 4 }
}
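Reading vectors out of a response shaped like the one above can be sketched as follows. The function names are illustrative. The base64 branch rests on an assumption that is not stated in this document: that with output_dtype float, a base64 embedding is the raw buffer of little-endian 32-bit floats. Verify this against the Voyage API reference before relying on it, and note that other output_dtype values would need a different element type.

```python
import base64
import struct


def decode_base64_floats(encoded: str) -> list[float]:
    """Decode a base64 string as packed little-endian float32 values.

    ASSUMPTION: this byte layout applies to output_dtype=float only.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4:
        raise ValueError("payload length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def extract_embeddings(response: dict) -> list:
    """Return the embeddings ordered by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    vectors = []
    for item in items:
        embedding = item["embedding"]
        if isinstance(embedding, str):  # returned when encoding_format is base64
            embedding = decode_base64_floats(embedding)
        vectors.append(embedding)
    return vectors
```

Sorting by index keeps each vector aligned with the position of its text in the input array, whatever order the data entries arrive in.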
The embedding array contains the full vector at the requested output_dimension (shown truncated above). When encoding_format is base64, each embedding is returned as a base64 string instead of a float array.
Quick facts
- Model provider: Voyage AI
- Type: Embeddings
- Lifecycle: Generally available (GA)
- Input type: text
- Output type: text
- Context window: 32000 tokens