azure-testing-random-gated-chat-completion
Version: 3
Invisible
azure-testing/random-gated-chat-completion is not visible in the Microsoft Foundry and Azure Machine Learning catalogs until 2026-01-16T00:00:00.000Z, when it will automatically become visible (i.e., public), unless the invisibleUntil tag is removed or its date is modified.
Gated Model Access Required
azure-testing/random-gated-chat-completion requires special access approval from the authors through Hugging Face. To use this model, you must:
- Request access through the model page on Hugging Face and wait for approval from the model authors.
- Create a Custom keys workspace connection in Microsoft Foundry or Azure Machine Learning named HuggingFaceTokenConnection, with the key HF_TOKEN and your Hugging Face read or fine-grained token as the value (marked as secret).
- Create the Managed Online Endpoint with the property enforce_access_to_default_secret_stores set to enabled so it can access the secret connection value.
- Once access is approved, the connection is configured, and the endpoint is created with read access to the token, you can deploy and use the model in Microsoft Foundry or Azure Machine Learning.
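As a sketch, the endpoint property from the steps above can be set in the Managed Online Endpoint YAML definition used with the Azure ML CLI; everything here other than the enforce_access_to_default_secret_stores property is an illustrative assumption:

```yaml
# endpoint.yaml - illustrative Managed Online Endpoint definition.
# Only the enforce_access_to_default_secret_stores property comes from the
# steps above; the endpoint name is a hypothetical placeholder.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-gated-model-endpoint
auth_mode: key
properties:
  enforce_access_to_default_secret_stores: enabled
```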
azure-testing/random-gated-chat-completion powered by vLLM
Chat Completions API
Send Request
You can use cURL or any REST client to send a request to the Azure ML endpoint with your Azure ML token.

curl <AZUREML_ENDPOINT_URL> \
-X POST \
-H "Authorization: Bearer <AZUREML_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"model":"azure-testing/random-gated-chat-completion","messages":[{"role":"user","content":"What is Deep Learning?"}]}'
Supported Parameters
The following are the only mandatory parameters to send in the HTTP POST request to /v1/chat/completions.
- model (string): Model ID used to generate the response. Since only a single model is deployed on this endpoint, you can either set it to azure-testing/random-gated-chat-completion or leave it blank.
- messages (array): A list of messages comprising the conversation so far. Depending on the model you use, different message types (modalities) are supported, like text, images, and audio.
For the full list of supported parameters, check the /openapi.json of the current Azure ML Endpoint.
Example payload
{
"model": "azure-testing/random-gated-chat-completion",
"messages": [
{"role":"user","content":"What is Deep Learning?"}
],
"max_completion_tokens": 256,
"temperature": 0.6
}
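The request above can also be sent from Python; this is a minimal standard-library sketch, where the endpoint URL and token are placeholders you must supply (here read from hypothetical environment variables):

```python
import json
import os
import urllib.request

# Placeholders: set these to your real endpoint URL and Azure ML token.
ENDPOINT_URL = os.environ.get("AZUREML_ENDPOINT_URL", "")
TOKEN = os.environ.get("AZUREML_TOKEN", "")

# Same payload as the example above.
payload = {
    "model": "azure-testing/random-gated-chat-completion",
    "messages": [{"role": "user", "content": "What is Deep Learning?"}],
    "max_completion_tokens": 256,
    "temperature": 0.6,
}
body = json.dumps(payload).encode("utf-8")

if ENDPOINT_URL:  # only send when an endpoint is actually configured
    request = urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print(json.load(response)["choices"][0]["message"]["content"])
else:
    print(payload["model"])
```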
Responses API
Alternatively, given that azure-testing/random-gated-chat-completion is a reasoning model, the recommended API is the OpenAI Responses API rather than the default OpenAI Chat Completions API described above.
Send Request
curl <AZUREML_ENDPOINT_URL>/v1/responses \
-X POST \
-d '{"model":"azure-testing/random-gated-chat-completion","input":"What is Deep Learning?","reasoning":{"effort":"medium"}}' \
-H "Authorization: Bearer <AZUREML_TOKEN>" \
-H "Content-Type: application/json"
Supported Parameters
The following are the only mandatory parameters to send in the HTTP POST request to /v1/responses.
- model (string): Model ID used to generate the response. Since only a single model is deployed on this endpoint, you can either set it to azure-testing/random-gated-chat-completion or leave it blank.
- input (string or array): Text, image, or file inputs to the model, or a list of messages comprising the conversation so far, used to generate the response. Depending on the model, different message types (modalities) are supported; in this case only text generation is supported, so image and audio inputs are disallowed.
For the full list of supported parameters, check the /openapi.json of the current Azure ML Endpoint.
Example Payload
{
"model": "azure-testing/random-gated-chat-completion",
"input": "What is Deep Learning?",
"max_output_tokens": 1024,
"temperature": 0.6,
"reasoning": {
"effort": "medium"
}
}
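The same payload can be posted to the Responses API route with a similar standard-library Python sketch; again, the endpoint URL and token are placeholders read from hypothetical environment variables:

```python
import json
import os
import urllib.request

# Placeholders: set these to your real endpoint URL and Azure ML token.
ENDPOINT_URL = os.environ.get("AZUREML_ENDPOINT_URL", "")
TOKEN = os.environ.get("AZUREML_TOKEN", "")

# Same payload as the example above.
payload = {
    "model": "azure-testing/random-gated-chat-completion",
    "input": "What is Deep Learning?",
    "max_output_tokens": 1024,
    "temperature": 0.6,
    "reasoning": {"effort": "medium"},
}

if ENDPOINT_URL:  # only send when an endpoint is actually configured
    request = urllib.request.Request(
        f"{ENDPOINT_URL}/v1/responses",  # note the explicit /v1/responses route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
else:
    print(payload["reasoning"]["effort"])
```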
Model Specifications
License: MIT
Last Updated: January 2026
Provider: Hugging Face