qwen-qwen3-235b-a22b-instruct-2507-fp8
Version: 3
Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
powered by vLLM
Send Request
You can use cURL or any REST client to send a request to the AzureML endpoint with your AzureML token:

```shell
curl <AZUREML_ENDPOINT_URL> \
  -X POST \
  -d '{"model":"Qwen/Qwen3-235B-A22B-Instruct-2507-FP8","messages":[{"role":"user","content":"What is Deep Learning?"}]}' \
  -H "Authorization: Bearer <AZUREML_TOKEN>" \
  -H "Content-Type: application/json"
```
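If you prefer Python, the same request can be built with the standard library alone. The following is a minimal sketch: `ENDPOINT_URL` and the token value are placeholders you must replace with your own AzureML values, and the final call is left commented out so nothing is sent accidentally.

```python
# Sketch of the cURL request above in plain Python (standard library only).
# ENDPOINT_URL and TOKEN are placeholders; substitute your AzureML values.
import json
import urllib.request

ENDPOINT_URL = "https://example.invalid"  # replace with your <AZUREML_ENDPOINT_URL>
TOKEN = "<AZUREML_TOKEN>"                 # replace with your AzureML token

request = urllib.request.Request(
    f"{ENDPOINT_URL}/v1/chat/completions",
    data=json.dumps({
        "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
        "messages": [{"role": "user", "content": "What is Deep Learning?"}],
    }).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```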
Supported Parameters
The following are the only mandatory parameters to send in the HTTP POST request to v1/chat/completions.
- model (string): Model ID used to generate the response. Since only a single model is deployed within this endpoint, you can either set it to Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 or leave it blank.
- messages (array): A list of messages comprising the conversation so far. Depending on the model, different message types (modalities) are supported, such as text, images, and audio; this deployment supports text generation only, so image and audio inputs are disallowed.
See /openapi.json on the current Azure ML endpoint for the full API specification.
Example payload
```json
{
  "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
  "messages": [
    {"role": "user", "content": "What is Deep Learning?"}
  ],
  "max_completion_tokens": 256,
  "temperature": 0.6
}
```
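Because vLLM implements an OpenAI-compatible API, a successful call returns an OpenAI-style chat completion object. The sketch below shows the general response shape only; all field values are illustrative, not actual output.

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Deep Learning is ..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 14, "completion_tokens": 128, "total_tokens": 142}
}
```

The generated text is at `choices[0].message.content`.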
Model Specifications
License: Apache-2.0
Last Updated: July 2025
Publisher: HuggingFace