gpt-4o

gpt-4o

OpenAI's most advanced multimodal model in the gpt-4o family. Can handle both text and image inputs.
Azure OpenAI
Direct from Azure
Version: 2024-11-20
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry; reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models .

About this model

As measured on traditional benchmarks, gpt-4o achieves gpt-4 turbo-level performance on text, reasoning, and coding intelligence, while setting new high watermarks on multilingual, audio, and vision capabilities.

Key model capabilities

  • Text, image processing
  • JSON Mode
  • parallel function calling
  • Enhanced accuracy and responsiveness
  • Parity with English text and coding tasks compared to GPT-4 Turbo with Vision
  • Superior performance in non-English languages and in vision tasks
  • Support for enhancements
  • Support for complex structured outputs.
ModelMMLUGPQAMATHMGSMDROPHumanEval
GPT-4o (2024-08-06)88.753.676.690.583.490.2
GPT-4T86.548.072.688.586.087.1
GPT-486.435.742.574.580.967.0
Claude3 Opus86.850.460.190.783.184.9
Gemini Pro 1.581.9--58.588.778.971.9
Gemini Ultra 1.083.7--53.279.082.474.4
Llama3 400b86.148.057.8--83.584.1
Source: the OpenAI announcement .

Quick facts

Model providerAzure OpenAI
TypeChat completion, Responses
LifecycleGenerally available (GA)
Input typetext, image, audio
Output typetext
Context window131.072k
Token limits16384 output

Related Models