OpenAI gpt-4o-mini-realtime-preview
Version: 2024-12-17
OpenAI. Last updated: February 2025
Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
GPT-4o-mini-realtime-preview is a smaller, lower-cost model for powering realtime speech applications. Like GPT-4o-realtime-preview, it provides a richer and more engaging user experience, but at a fraction of the cost. It opens up possibilities for businesses in various sectors:

Enhanced customer service: By integrating audio inputs, GPT-4o-mini-realtime-preview enables more dynamic and comprehensive customer support interactions.

Content innovation: Use GPT-4o-mini-realtime-preview's generative capabilities to create engaging and diverse audio content, catering to a broad range of consumer preferences.

Real-time translation: Use GPT-4o-mini-realtime-preview to provide accurate and immediate translations, facilitating communication across different languages.

Model Versions

2024-12-17: Introducing our new multimodal AI model, which now supports both text and audio modalities. As this is a preview version, it is designed for testing and feedback purposes and is not yet optimized for production traffic.

Limitations

Currently, the GPT-4o-mini-realtime-preview model focuses on text and audio and does not support existing GPT-4o features such as image modality and structured outputs. For many tasks, the generally available GPT-4o-mini models may still be more suitable. IMPORTANT: At this time, GPT-4o-mini-realtime-preview usage limits are suitable for test and development. To prevent abuse and preserve service integrity, rate limits will be adjusted as needed.
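The feature limits described in this section can be turned into a simple pre-flight check before a request is routed to the preview model. The following is a minimal sketch, not part of any SDK: the feature labels (`text_input`, `image_input`, and so on) and the function name `check_features` are hypothetical names chosen for illustration, and the supported and unsupported sets only restate what this card says (text and audio are supported; image modality and structured outputs are not).

```python
# Hypothetical feature labels for this sketch; they are NOT API identifiers.
# Supported: text and audio, in both directions, as stated on this card.
SUPPORTED_FEATURES = frozenset(
    {"text_input", "audio_input", "text_output", "audio_output"}
)
# Explicitly listed as not supported by the preview model on this card.
UNSUPPORTED_FEATURES = frozenset({"image_input", "structured_outputs"})


def check_features(requested):
    """Split requested features into (supported, unsupported, unknown) sets.

    'unknown' holds labels this card says nothing about, so the caller
    can decide how to treat them instead of assuming either way.
    """
    requested = set(requested)
    supported = requested & SUPPORTED_FEATURES
    unsupported = requested & UNSUPPORTED_FEATURES
    unknown = requested - SUPPORTED_FEATURES - UNSUPPORTED_FEATURES
    return supported, unsupported, unknown


if __name__ == "__main__":
    ok, blocked, unknown = check_features(["audio_input", "image_input"])
    if blocked:
        # Per the card, a generally available GPT-4o-mini model may suit
        # such tasks better; that routing decision is left to the caller.
        print("Not supported by the preview model:", sorted(blocked))
```

Keeping "unknown" separate from "unsupported" is a deliberate choice in this sketch: the card names only two unsupported features, so anything else should be verified against current service documentation rather than assumed.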

Model Provider

This model is provided through the Azure OpenAI service.

Relevant Documents

The following documents are applicable:

Responsible AI Considerations

GPT-4o-mini-realtime-preview has safety built in by design across modalities, through techniques such as filtering training data and refining the model's behavior through post-training. We have also created new safety systems to provide guardrails on voice outputs.

We've evaluated GPT-4o-mini-realtime-preview according to our Preparedness Framework and in line with our voluntary commitments. Our evaluations of cybersecurity, CBRN, persuasion, and model autonomy show that GPT-4o-mini-realtime-preview does not score above Medium risk in any of these categories. This assessment involved running a suite of automated and human evaluations throughout the model training process. We tested both pre-safety-mitigation and post-safety-mitigation versions of the model, using custom fine-tuning and prompts, to better elicit model capabilities.

GPT-4o-mini-realtime-preview has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities. We used these learnings to build out our safety interventions and improve the safety of interacting with GPT-4o-mini-realtime-preview. We will continue to mitigate new risks as they're discovered.

Model Specifications

Context Length: 128000
License: Custom
Training Data: October 2023
Last Updated: February 2025
Input Type: Audio, Text
Output Type: Audio, Text
Publisher: OpenAI
Languages: 27 Languages