gpt-audio-mini
Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry; reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
About this model
gpt-audio-mini enables voice-based interaction by processing spoken prompts and generating responses, capturing subtle audio cues for deeper, more immersive experiences. Note: For customers interested in lower latency audio responses,gpt-realtime-mini may still be more suitable.
Key model capabilities
These audio features can be utilized in various ways:- Create spoken summaries from text, offering a more engaging method to present information.
- Analyze the sentiment of audio recordings, converting vocal nuances into text-based insights.
- Facilitate asynchronous speech-in, speech-out interactions
Quick facts
Model providerAzure OpenAI
TypeAudio generation
LifecycleGenerally available (GA)
Input typeaudio, text
Output typeaudio, text
Context window128k
Token limits16384 output
PricingView pricing