embed-v-4-0
Embed 4 transforms texts and images into numerical vectors
Cohere's Embed 4 is a multilingual, multimodal embedding model. It can transform different modalities, such as images, texts, and interleaved images and texts, into a single vector representation. Embed 4 offers state-of-the-art performance across all modalities (texts, images, and interleaved texts and images) in both English and multilingual settings.
Embed 4 supports a 128k context length, and each image can have a maximum of 2 million pixels. Embed 4 can vectorize interleaved texts and images and capture key visual features from screenshots of PDFs, slides, tables, figures, and more, which eliminates the need for complex document parsing. Embed 4 offers several ways to compress embeddings, both in the number of dimensions and in number-format precision: Matryoshka embeddings reduce the number of dimensions, and byte and binary quantization reduce the precision.
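To make the compression options more concrete, the following is a minimal sketch of what each one does to a vector, written in plain Python with no SDK calls. It is not the service's implementation: the function names, the 1024-dimension input, the symmetric scaling used for byte quantization, and the "positive component becomes bit 1" rule for binary quantization are all assumptions made for illustration. In practice the service would normally return embeddings already compressed in the requested format.

```python
import math
import random


def truncate_matryoshka(vec, dims):
    """Keep the first `dims` components and re-normalize to unit length."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def quantize_int8(vec):
    """Byte quantization (assumed scheme): scale symmetrically into [-127, 127]."""
    scale = max(abs(x) for x in vec)
    if scale == 0:
        return [0] * len(vec)
    return [max(-127, min(127, round(x / scale * 127))) for x in vec]


def quantize_binary(vec):
    """Binary quantization (assumed scheme): one sign bit per dimension, packed into bytes."""
    packed = bytearray()
    for i in range(0, len(vec), 8):
        byte = 0
        for j, x in enumerate(vec[i:i + 8]):
            if x > 0:
                byte |= 1 << (7 - j)  # most significant bit first
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Count differing bits between two packed binary embeddings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Illustrative stand-in for a real embedding (hypothetical 1024-dim vector).
rng = random.Random(0)
embedding = [rng.gauss(0.0, 1.0) for _ in range(1024)]

short = truncate_matryoshka(embedding, 256)   # 4x fewer dimensions
as_bytes = quantize_int8(short)               # 1 byte per dimension
as_bits = quantize_binary(short)              # 1 bit per dimension

print(len(short), len(as_bytes), len(as_bits))
```

The two techniques compose: truncation shrinks the number of dimensions, and quantization then shrinks the storage per dimension. Binary embeddings are typically compared with Hamming distance rather than cosine similarity, which is why a helper for it is included.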
Quick facts
Model provider: Cohere
Type: Embeddings, Summarization
Lifecycle: Generally available (GA)
Input type: image, text
Output type: image, text
Context window: 131,072 tokens
Token limits: 4096 output
Pricing: View pricing