OpenAI gpt-audio
Version: 2025-08-28
gpt-audio enables voice-based interaction by processing spoken prompts and generating responses, capturing subtle audio cues for deeper, more immersive experiences.
Note: For customers interested in lower latency audio responses,
gpt-realtime
may still be more suitable.
These audio features can be utilized in various ways:
- Create spoken summaries from text, offering a more engaging method to present information.
- Analyze the sentiment of audio recordings, converting vocal nuances into text-based insights.
- Facilitate asynchronous speech-in, speech-out interactions
Model provider
This model is provided through the Azure OpenAI Service.Relevant documents
The following documents are applicable:Model Specifications
Context Length128000
LicenseCustom
Training DataAugust 2025
Last UpdatedSeptember 2025
Input TypeAudio,Text
Output TypeAudio,Text
PublisherOpenAI
Languages27 Languages