gpt-audio-mini

gpt-audio-mini

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
Azure OpenAI
Direct from Azure
Version: 2025-10-06
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry; reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models .

About this model

gpt-audio-mini enables voice-based interaction by processing spoken prompts and generating responses, capturing subtle audio cues for deeper, more immersive experiences. Note: For customers interested in lower latency audio responses, gpt-realtime-mini may still be more suitable.

Key model capabilities

These audio features can be utilized in various ways:
  • Create spoken summaries from text, offering a more engaging method to present information.
  • Analyze the sentiment of audio recordings, converting vocal nuances into text-based insights.
  • Facilitate asynchronous speech-in, speech-out interactions

Quick facts

Model providerAzure OpenAI
TypeAudio generation
LifecycleGenerally available (GA)
Input typeaudio, text
Output typeaudio, text
Context window128k
Token limits16384 output