Qwen/Qwen3.59B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token. ba
Qwen/Qwen3.535BA3B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML tok
CohereLabs/commandaplus052026w4a4 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azur
CohereLabs/commandaplus052026bf16 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azur
Qwen/Qwen3.627B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.
Qwen/Qwen3.635BA3B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML tok
Invisible google/gemma431Bit is not visible on the Microsoft Foundry and Azure Machine Learning catalogs until 20260414T17:00:00.000Z, when it will automatically switch to visible i.e., public, unless either the invisibleUntil tag is removed or the invisibleUntil date is modified.
Invisible google/gemma4E4Bit is not visible on the Microsoft Foundry and Azure Machine Learning catalogs until 20260414T17:00:00.000Z, when it will automatically switch to visible i.e., public, unless either the invisibleUntil tag is removed or the invisibleUntil date is modified.
unsloth/Qwen3.635BA3BGGUF with UDIQ3S quantization, powered by llama.cpp Original Model Card llama.cpp Documentation GGUF GGUF is a binary file format optimized for quick loading and
TongyiMAI/ZImageTurbo powered by Hugging Face API Original Model Card texttoimage Task on Hugging Face Send Request You can use cURL or any REST Client to send a request to the Azure
openai/whisperlargev3 powered by Hugging Face Inference Toolkit Original Model Card automaticspeechrecognition Task on Hugging Face Send Request You can use cURL or any R
Invisible google/gemma426BA4Bit is not visible on the Microsoft Foundry and Azure Machine Learning catalogs until 20260414T17:00:00.000Z, when it will automatically switch to visible i.e., public, unless either the invisibleUntil tag is removed or the invisibleUntil date is modified
Gated Model Access Required facebook/sam3 requires special access approval from the authors through Hugging Face. To use this model, you must: 1. Request access through the model page on Hugging Face and wait for approval from the model authors. 2. [Cr
CohereLabs/commandaplus052026fp8 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure
sentencetransformers/allMiniLML6v2 powered by Text Embeddings Inference Original Model Card Text Embeddings Inference Documentation [featureextraction Task
BAAI/bgem3 powered by Text Embeddings Inference (TEI) Original Model Card Text Embeddings Inference Documentation Send Request You can use cURL or any REST Client to send a request to the Azu
Invisible google/gemma4E2Bit is not visible on the Microsoft Foundry and Azure Machine Learning catalogs until 20260414T17:00:00.000Z, when it will automatically switch to visible i.e., public, unless either the invisibleUntil tag is removed or the invisibleUntil date is modified.
Model Card for openai/gptoss20b in Azure <p align="center" <img alt="gptoss20b" src="https://raw.githubusercontent.com/openai/gptoss/main/docs/gptoss20b.svg" </p <p align="center" <a href="https://gptoss.com"<strongTry gptoss</strong</a � <a href="https://cookbook.openai.c
stabilityai/stablediffusionxlbase1.0 powered by Hugging Face API Original Model Card texttoimage Task on Hugging Face Send Request You can use cURL or any REST Client
nvidia/parakeettdt0.6bv3 powered by Hugging Face + NeMo Original Model Card Send Request You can use cURL or any REST Client to send a request to the AzureML endpoint with your AzureML token. bash curl <AZUREMLENDPOINTURL \
nvidia/Nemotron3NanoOmni30BA3BReasoningBF16 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to se
Model Card for openai/gptoss120b in Azure <p align="center" <img alt="gptoss120b" src="https://raw.githubusercontent.com/openai/gptoss/main/docs/gptoss120b.svg" </p <p align="center" <a href="https://gptoss.com"<strongTry gptoss</strong</a � <a href="https://cookbook.opena
Qwen/Qwen3.54B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token. ba
Qwen/QwenImage2512 powered by Hugging Face API Original Model Card texttoimage Task on Hugging Face Send Request You can use cURL or any REST Client to send a request to the Azure ML endpo
Qwen/Qwen30.6B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token. bas
TongyiMAI/ZImage powered by Hugging Face API Original Model Card texttoimage Task on Hugging Face Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint
Gated Model Access Required blackforestlabs/FLUX.1schnell requires special access approval from the authors through Hugging Face. To use this model, you must: 1. Request access through the model page on Hugging Face and wait for a
Qwen/Qwen3Embedding0.6B powered by Text Embeddings Inference Original Model Card Text Embeddings Inference Documentation [featureextraction Task on Hugging Face](https://h
NousResearch/Hermes3Llama3.18B powered by Text Generation Inference (TGI) Example Notebook Original Model Card [Text Generation Inference Documentation](
Qwen/Qwen2.57BInstruct powered by Text Generation Inference (TGI) Example Notebook Original Model Card [Text Generation Inference Documentation](https://huggingface.
Qwen/Qwen3ASR1.7B powered by vLLM Original Model Card vLLM Documentation Transcriptions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.
Qwen/WebWorld8B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.
Qwen/WebWorld32B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.
Qwen/Qwen3.627BFP8 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML tok
openaicommunity/gpt2 powered by Text Generation Inference (TGI) Example Notebook Original Model Card [Text Generation Inference Documentation](https://huggingface.co/doc
openai/whisperlargev3turbo powered by Hugging Face Inference API Original Model Card automaticspeechrecognition Task on Hugging Face Send Request You can use cURL
Qwen/Qwen3Coder30BA3BInstruct powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoi
Qwen/Qwen3.52B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token. ba
lordx64/Qwen3.635BA3BClaude4.7OpusReasoningDistilled powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any
nvidia/Nemotron3NanoOmni30BA3BReasoningNVFP4 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to
metallama/MetaLlama38BInstruct powered by Text Generation Inference. Example Notebook Original Model Card Send Request You can use cURL or any REST
dphn/DolphinMistral24BVeniceEdition powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azu
NousResearch/Hermes4.336B powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your
unsloth/Qwen3.627BNVFP4 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Az
mistralai/Mistral7BInstructv0.2 powered by Text Generation Inference. Example Notebook Original Model Card Send Request You can use cURL or any REST Cl
Qwen/Qwen2.5VL7BInstruct powered by Text Generation Inference (TGI) Example Notebook Original Model Card [Text Generation Inference Documentation](https://huggin
nvidia/Gemma431BITNVFP4 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with you
googlebert/bertbaseuncased powered by Hugging Face Inference Toolkit Original Model Card fillmask Task on Hugging Face Send Request You can use cURL or any REST Client to send a reque
sentencetransformers/paraphrasemultilingualMiniLML12v2 powered by Text Embeddings Inference Original Model Card [Text Embeddings Inference Documentation](https://huggingface.co/docs/textembeddingsinfe
Qwen/Qwen3CoderNext powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML t
Qwen/Qwen3.635BA3BFP8 powered by vLLM Original Model Card vLLM Documentation Chat Completions API Send Request You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azur