gpt-5 is designed for logic-heavy and multi-step tasks.
gpt-5-mini is a lightweight version for cost-sensitive applications.
gpt-5-nano is optimized for speed, ideal for applications requiring low latency.
gpt-5-chat (preview) is designed for advanced, natural, multimodal, and context-aware conversations in enterprise applications.
Generate images with amazing image quality, prompt adherence, and diversity at blazing fast speeds. FLUX1.1 [pro] delivers six times faster image generation and achieved the highest Elo score on Artificial Analysis benchmarks when launched, surpassing all
Generate and edit images through both text and image prompts. FLUX.1 Kontext is a multimodal flow matching model that enables both text-to-image generation and in-context image editing. Modify images while maintaining character consistency and performing l
The o3 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers.
codex-mini is a fine-tuned variant of the o4-mini model, designed to deliver rapid, instruction-following performance for developers working in CLI workflows. Whether you're automating shell commands, editing scripts, or refactoring repositories, Codex-Min
The DeepSeek R1 0528 model has improved reasoning capabilities. This version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.
An efficient AI solution to generate videos
Grok 3 is xAI's debut model, pretrained by Colossus at supermassive scale to excel in specialized domains like finance, healthcare, and the law.
Grok 3 Mini is a lightweight model that thinks before responding. Trained on mathematical and scientific problems, it is great for logic-based tasks.
Model router is a deployable AI model that is trained to select the most suitable large language model (LLM) for a given prompt.
o3 includes significant improvements in quality and safety while supporting the existing features of o1 and delivering comparable or better performance.
o4-mini includes significant improvements in quality and safety while supporting the existing features of o3-mini and delivering comparable or better performance.
MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to fill in information gaps in the previous version of the model and improve its harm protections while maintaining R1 reasoning capabilities.
An efficient AI solution for diverse text and image tasks, including text to image, image to image, inpainting, and prompt transformation.
gpt-4.1 outperforms gpt-4o across the board, with major gains in coding, instruction following, and long-context understanding.
gpt-4.1-mini outperforms gpt-4o-mini across the board, with major gains in coding, instruction following, and long-context handling.
gpt-4.1-nano provides gains in coding, instruction following, and long-context handling, along with lower latency and cost.
Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.
Microsoft Research's EvoDiff is a diffusion modeling framework capable of generating high-fidelity, diverse, and novel proteins with the option of conditioning according to sequence constraints. Because it operates in the universal protein design space, EvoDiff can unconditionally sample diverse str
State-of-the-art open-weight reasoning model.
Lightweight math reasoning model optimized for multi-step problem solving.
Llama 4 Scout 17B 16E Instruct is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.
Llama 4 Maverick 17B 128E Instruct FP8 is great at precise image understanding and creative writing, offering high quality at a lower price compared to Llama 3.3 70B.
Command A is a highly efficient generative model that excels at agentic and multilingual use cases.
Embed 4 transforms text and images into numerical vectors.
The largest and strongest general-purpose model in the gpt model family to date, best suited for diverse text and image tasks.
o3-mini includes the o1 features with significant cost efficiencies for scenarios requiring high performance.
DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.
Llama 4 Scout 17B 16E is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.
An advanced text-to-speech solution designed to convert written text into natural-sounding speech.
A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts.
A highly efficient and cost-effective speech-to-text solution that delivers reliable and accurate transcripts.
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
DeepSeek-R1 uses a step-by-step training process to excel at reasoning tasks such as language, scientific reasoning, and coding.
computer-use-preview is the model for the Computer Use Agent in the Responses API. You can use the computer-use-preview model to get instructions to control a browser on your computer screen and take action on a user's behalf.
A 3.8B-parameter small language model that outperforms larger models in reasoning, math, coding, and function calling.
The first small multimodal model to accept three input modalities (text, audio, image), excelling in quality and efficiency.
Phi-4 14B is a highly capable model for low-latency scenarios.
Document conversion to Markdown with interleaved images and text.
Enhanced Mistral Small 3 with multimodal capabilities and a 128k context length.
Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
Focused on advanced reasoning and solving complex problems, including math and science tasks. Ideal for applications that require deep contextual understanding and agentic workflows.
Smaller, faster, and 80% cheaper than o1-preview; performs well at code generation and small-context operations.
OpenAI's most advanced multimodal model in the gpt-4o family. Can handle both text and image inputs.
An affordable, efficient AI solution for diverse text and image tasks.
Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.
The gpt-4o-realtime-preview model introduces a new era in AI interaction by incorporating the new audio modality powered by gpt-4o. This new modality allows for seamless speech-to-speech and text-to-speech applications, providing a richer and more engaging user experience. Engineered for speed and e