Cohere-rerank-v4.0-fast

Rerank improves search systems by sorting documents based on their semantic similarity to a query
Cohere
Direct from Azure
Version: 2
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all as part of one Microsoft Foundry platform.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

About this model

Cohere's Rerank v4.0 Fast endpoint helps businesses improve search and retrieval-augmented generation (RAG) systems. It takes a query and a list of potentially relevant documents as input, and returns those documents as a list sorted by semantic similarity to the query. As a cross-encoder model, Rerank v4.0 Fast scores each document together with the query, which lets it capture the meaning behind enterprise data and user questions. It can be integrated with a few lines of code, delivers strong performance across more than 100 languages, and can handle complex information that requires reasoning. These attributes make it well suited to global organizations in finance, healthcare, energy, government, and manufacturing.
Rerank v4.0 Fast can be added to an existing search system, whether keyword-based or semantic, to improve the relevance of its results.
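
As an illustration of the flow described above, the following minimal Python sketch places a rerank step behind a naive keyword search. It is not an official client. The request fields (model, query, documents, top_n) and response fields (results, index, relevance_score) are assumptions modeled on Cohere's public Rerank API and may differ for a given deployment. The send callable is a hypothetical transport hook, fake_send is an offline stand-in for the service, and the endpoint URL and authentication are deliberately left out.

```python
from typing import Callable, Dict, List

# Assumed model/deployment name; replace with the name of your own deployment.
MODEL_NAME = "Cohere-rerank-v4.0-fast"


def build_rerank_request(query: str, documents: List[str], top_n: int) -> Dict:
    """Assemble a JSON-serialisable request body (field names are assumptions)."""
    return {
        "model": MODEL_NAME,
        "query": query,
        "documents": documents,
        "top_n": min(top_n, len(documents)),
    }


def apply_rerank_response(documents: List[str], response: Dict) -> List[str]:
    """Reorder documents using the assumed response shape:
    {"results": [{"index": int, "relevance_score": float}, ...]}."""
    ranked = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [documents[r["index"]] for r in ranked]


def keyword_candidates(query: str, corpus: List[str], k: int) -> List[str]:
    """First stage: naive keyword retrieval that counts shared lowercase terms."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(doc.lower().split())), i) for i, doc in enumerate(corpus)
    ]
    # Highest overlap first; ties keep corpus order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [corpus[i] for overlap, i in scored[:k] if overlap > 0]


def search(
    query: str,
    corpus: List[str],
    send: Callable[[Dict], Dict],
    k: int = 10,
    top_n: int = 3,
) -> List[str]:
    """Two-stage search: keyword candidates first, then rerank via `send`."""
    candidates = keyword_candidates(query, corpus, k)
    if not candidates:
        return []
    request = build_rerank_request(query, candidates, top_n)
    return apply_rerank_response(candidates, send(request))


def fake_send(request: Dict) -> Dict:
    """Offline stand-in for the service: later candidates get higher scores."""
    n = len(request["documents"])
    results = [{"index": i, "relevance_score": i / n} for i in range(n)]
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return {"results": results[: request["top_n"]]}
```

In a real integration, send would post the request body to the deployed endpoint and return the parsed JSON response. Keeping the transport as a parameter lets the first-stage retriever, whether keyword or semantic, stay unchanged while the rerank step is added on top.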

Key model capabilities

Quick facts

Model provider: Cohere
Type: Text classification
Lifecycle: Generally available (GA)
Input type: text, image
Output type: text, image
Context window: 4096
Token limits: 2048 output