Azure-Speech-Voice-Live
Voice Live API is a single unified API that enables low-latency, high-quality speech-to-speech interactions for voice agents.
Azure Speech is a comprehensive suite of AI-powered speech capabilities that includes speech to text, text to speech, speech translation, and Voice Live. It enables developers to build intelligent voice-enabled applications with high accuracy, multilingual support, and customizable voice experiences.
About this model
Voice Live API is designed for developers seeking scalable and efficient voice-driven experiences, as it eliminates the need to manually orchestrate multiple components. By integrating speech recognition, generative AI, and text to speech functionalities into a single, unified interface, it provides an end-to-end solution for creating seamless voice conversation experiences.
Key model capabilities
Voice Live API includes a comprehensive set of features to support diverse use cases and ensure superior voice interactions: broad language coverage, customizable speech input and output, flexible GenAI model options, advanced noise suppression, echo cancellation, semantic voice activity detection (VAD), avatar, and function calling.
Quick facts
Model provider: Microsoft
Type: Conversational AI, Speech to text, Text to speech
Lifecycle: Generally available (GA)
Input type: text, audio
Output type: text, audio
Pricing: View pricing
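To make the capabilities listed above more concrete, here is a minimal sketch of how a client might describe one voice session in a single configuration message, rather than wiring up separate recognition, model, and synthesis components. Everything in it is an assumption for illustration: the event name `session.update`, the field names, the type strings such as `azure_semantic_vad`, the voice name, and the `get_weather` tool are modeled on a realtime-style event schema and are not taken from this page. Check the official Voice Live API reference before relying on any of them. The code only builds and serializes the payload; it does not open a connection.

```python
import json


def build_session_update(
    voice_name="en-US-Ava:DragonHDLatestNeural",  # assumed example voice name
    instructions="You are a helpful voice agent.",
):
    """Build a hypothetical session configuration event as a plain dict.

    All keys and type strings below are assumptions for illustration,
    not a confirmed schema.
    """
    return {
        "type": "session.update",  # assumed event name
        "session": {
            "instructions": instructions,
            # Text and audio match the input/output types in Quick facts.
            "modalities": ["text", "audio"],
            # Semantic VAD: assumed field and type string.
            "turn_detection": {"type": "azure_semantic_vad"},
            # Noise suppression: assumed field and type string.
            "input_audio_noise_reduction": {
                "type": "azure_deep_noise_suppression"
            },
            # Echo cancellation: assumed field and type string.
            "input_audio_echo_cancellation": {
                "type": "server_echo_cancellation"
            },
            # Customizable speech output: assumed voice descriptor.
            "voice": {"name": voice_name, "type": "azure-standard"},
            # Function calling: a hypothetical tool definition.
            "tools": [
                {
                    "type": "function",
                    "name": "get_weather",
                    "description": "Look up the weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Serialize the event as it might be sent over a streaming connection.
    print(json.dumps(build_session_update(), indent=2))
```

Keeping the whole session description in one message mirrors the single-interface design described in the text: turn detection, audio cleanup, voice selection, and tools are declared together instead of being configured on separate services.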