baai-bge-reranker-v2-m3
Version: 6
Publisher: HuggingFace · Last updated: August 2025
BAAI/bge-reranker-v2-m3 powered by Text Embeddings Inference (TEI)

Send Request

You can use cURL or any REST client to send a request to the AzureML endpoint with your AzureML token. Note that since BAAI/bge-reranker-v2-m3 is a text-ranking model, you need to use the /rerank route instead of the default scoring route, which maps to the /embed route used to generate embeddings.
curl <AZUREML_ENDPOINT_URL>/rerank \
    -X POST \
    -d '{"query":"What is Deep Learning?","texts":["Deep Learning is...","Deep Learning is not..."]}' \
    -H "Authorization: Bearer <AZUREML_TOKEN>" \
    -H "Content-Type: application/json"

Supported Parameters

  • query (string): The input query.
  • raw_scores (bool, optional): Whether to return the raw scores or the normalized ones. If false, normalized score probabilities are returned; if true, the raw unnormalized scores (real numbers) are returned. Defaults to false.
  • return_text (bool, optional): Whether to return the text alongside each ranking result. Defaults to false.
  • texts (array): A list of texts against which the query will be ranked.
  • truncate (bool, optional): Whether to truncate inputs that are longer than the maximum supported size. Defaults to false.
  • truncation_direction ('left' or 'right', optional): The direction in which to truncate. "right" means tokens are removed from the end of the sequence until the input fits within the maximum supported size, whilst "left" means they are removed from the beginning. Defaults to "right".

Example payload

{
  "query": "What is Deep Learning?",
  "raw_scores": false,
  "return_text": false,
  "texts": [
    "Deep Learning is...",
    "Deep Learning is not..."
  ],
  "truncate": true,
  "truncation_direction": "right"
}
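
The sketch below sends this payload and maps the ranking results back to the input texts. It assumes the /rerank route returns a JSON array of objects with index and score fields (plus text when return_text is true); the endpoint URL and token are placeholders for your own values.

import requests

# Placeholders: replace with your own AzureML endpoint URL and token.
ENDPOINT_URL = "<AZUREML_ENDPOINT_URL>"
AZUREML_TOKEN = "<AZUREML_TOKEN>"

texts = ["Deep Learning is...", "Deep Learning is not..."]
payload = {
    "query": "What is Deep Learning?",
    "raw_scores": False,
    "return_text": False,
    "texts": texts,
    "truncate": True,
    "truncation_direction": "right",
}

response = requests.post(
    f"{ENDPOINT_URL}/rerank",
    headers={
        "Authorization": f"Bearer {AZUREML_TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
)
response.raise_for_status()

# Assumed response shape: a list of {"index": <int>, "score": <float>} objects.
# Sort by score (highest first) and look up each result's original text.
for result in sorted(response.json(), key=lambda r: r["score"], reverse=True):
    print(f'{result["score"]:.4f}  {texts[result["index"]]}')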
Model Specifications
License: Apache-2.0
Last Updated: August 2025
Publisher: HuggingFace