Cohere-rerank-v4.0-fast
Rerank improves search systems by sorting documents based on their semantic similarity to a query.
Rerank v4.0 Fast can be added to existing systems, whether keyword or semantic, to improve performance.
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all as part of one Microsoft Foundry platform.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
About this model
Cohere's Rerank v4.0 Fast endpoint enables businesses to significantly improve search and retrieval-augmented generation systems. As input, it takes a query and a list of potentially relevant documents. It then returns the documents as a list sorted by semantic similarity to the provided query. As a cross-encoding AI model, Rerank v4.0 Fast can interpret the meaning behind enterprise data and user questions. It can be implemented with just a few lines of code, delivers leading performance across more than 100 languages, and is uniquely capable of understanding complex information that requires reasoning. These attributes make Rerank v4.0 Fast particularly well suited for global organizations within Finance, Healthcare, Energy, Government, and Manufacturing.
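The input and output contract described above (a query plus candidate documents in, a relevance-sorted list out) can be sketched as follows. This is a minimal illustration, not the Rerank v4.0 Fast API: the names `rerank`, `RerankResult`, `index`, `relevance_score`, and `top_n` are assumptions chosen for the example, and `token_overlap_score` is a simple lexical placeholder standing in for the model's semantic scoring.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class RerankResult:
    index: int              # position of the document in the original input list
    relevance_score: float  # higher means more relevant to the query


def token_overlap_score(query: str, document: str) -> float:
    """Placeholder scorer (NOT the model): fraction of query words found in the document."""
    query_words = set(re.findall(r"\w+", query.lower()))
    doc_words = set(re.findall(r"\w+", document.lower()))
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    documents: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_n: Optional[int] = None,
) -> List[RerankResult]:
    """Score every (query, document) pair and return results sorted best-first."""
    scored = [
        RerankResult(index=i, relevance_score=score_fn(query, doc))
        for i, doc in enumerate(documents)
    ]
    # Python's sort is stable, so tied documents keep their original order.
    scored.sort(key=lambda r: r.relevance_score, reverse=True)
    return scored if top_n is None else scored[:top_n]


# Candidates as they might arrive from an existing keyword or semantic search stage.
candidates = [
    "Carson City is the capital city of Nevada.",
    "Washington, D.C. is the capital of the United States.",
    "Capital punishment has existed in the United States since colonial times.",
]

for result in rerank("capital of the united states", candidates, token_overlap_score):
    print(result.index, round(result.relevance_score, 2), candidates[result.index])
```

Because the scorer is passed in as a function, the first-stage retriever stays unchanged and only the ordering step is swapped. In a real deployment, `score_fn` would be replaced by a call to the deployed reranking endpoint. The placeholder ranks the "capital punishment" document second purely on shared words, which is the kind of lexical false match a semantic reranker is meant to demote.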
Key model capabilities
Quick facts
Model provider: Cohere
Type: Text classification
Lifecycle: Generally available (GA)
Input type: text, image
Output type: text, image
Context window: 4096
Token limits: 2048 output
Pricing: View pricing