jinaai-readerlm-v2
Version: 1
jinaai/ReaderLM-v2 powered by Text Generation Inference (TGI)

Send Request

You can use cURL or any REST client to send a request to the AzureML endpoint with your AzureML token.
curl <AZUREML_ENDPOINT_URL> \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H "Authorization: Bearer <AZUREML_TOKEN>" \
    -H "Content-Type: application/json"

Supported Parameters

You can control the generation by defining parameters in the parameters attribute of the payload. The following parameters are currently supported:
  • temperature: Controls randomness in the model. Lower values will make the model more deterministic and higher values will make the model more random. Default value is 1.0.
  • max_new_tokens: The maximum number of tokens to generate. Default value is 20, max value is 512.
  • repetition_penalty: Controls the likelihood of repetition. Default is null.
  • seed: The seed to use for random generation. Default is null.
  • stop: A list of tokens to stop the generation. The generation will stop when one of the tokens is generated.
  • top_k: The number of highest probability vocabulary tokens to keep for top-k-filtering. Default value is null, which disables top-k-filtering.
  • top_p: If set to a value less than 1, only the smallest set of most probable tokens whose cumulative probability reaches top_p is kept for nucleus sampling. Default value is null, which disables nucleus sampling.
  • do_sample: Whether or not to use sampling; use greedy decoding otherwise. Default value is false.
  • best_of: Generate best_of sequences and return the one with the highest token log probabilities. Default value is null.
  • details: Whether or not to return details about the generation. Default value is false.
  • return_full_text: Whether or not to return the full text or only the generated part. Default value is false.
  • truncate: Whether or not to truncate the input to the maximum length of the model. Default value is true.
  • typical_p: The typical probability mass to use for typical decoding. Default value is null.
  • watermark: Whether or not to watermark the generated text. Default value is false.
Example payload
{
  "inputs": "What is Deep Learning?",
  "parameters": {
    "do_sample": true,
    "top_p": 0.95,
    "temperature": 0.2,
    "top_k": 50,
    "max_new_tokens": 256,
    "repetition_penalty": 1.03,
    "stop": ["\nUser:", "<|endoftext|>", "</s>"]
  }
}
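The example payload can be sent from Python in the same way. The sketch below posts it with the requests library and prints the raw JSON response; TGI-backed endpoints usually return the completion under a generated_text field, but verify the exact response shape of your deployment.

import requests

# Replace these placeholders with your own endpoint URL and AzureML token.
AZUREML_ENDPOINT_URL = "<AZUREML_ENDPOINT_URL>"
AZUREML_TOKEN = "<AZUREML_TOKEN>"

payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {
        "do_sample": True,
        "top_p": 0.95,
        "temperature": 0.2,
        "top_k": 50,
        "max_new_tokens": 256,
        "repetition_penalty": 1.03,
        "stop": ["\nUser:", "<|endoftext|>", "</s>"],
    },
}

response = requests.post(
    AZUREML_ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {AZUREML_TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
)
response.raise_for_status()

# Inspect the JSON response; the generated text is typically available
# under a "generated_text" key, depending on the deployment.
print(response.json())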
Model Specifications
License: CC BY-NC 4.0
Last Updated: March 2025
Publisher: HuggingFace