Grok 3
Grok 3
Version: 1
xAILast updated October 2025
Grok 3 is xAI's debut model, pretrained by Colossus at supermassive scale to excel in specialized domains like finance, healthcare, and the law.
Understanding
Instruction
Summarization

Azure Direct Models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all as part of one Azure AI Foundry platform.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Azure AI Foundry; reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models .

Key capabilities

About this model

Grok 3 blends unparalleled intelligence with vast pretraining knowledge, honed on xAI's Colossus supercluster. The model comes equipped with enterprise features, such as:
  • Deep domain expertise: a strong world-knowledge in finance, healthcare, and the law.
  • Instruction-following: follows chain of command and is less likely to refuse queries.
  • Document reasoning: can process extensive and complicated professional documents.
Grok 3 is purpose-built to be the workhorse model for the enterprise, providing a strong foundation for business workflows in any professional field.

Key model capabilities

  • Deep domain expertise: With deep domain expertise in finance, healthcare, law and science, Grok 3 excels at enterprise tasks like financial forecasting, medical diagnosis support, legal document analysis, and scientific research assistance—delivering precise, domain-specific solutions.
  • Extended Context Length: With an extended context length of up to 16k tokens (131K coming soon), Grok 3 processes and understands vast datasets in a single pass—ideal for comprehensive analysis of large documents or complex workflows.
  • Steerability & Chain of Command: Grok 3 is extremely steerable and follows instructions closely. The model is less likely to refuse queries, providing more helpful responses while maintaining safety and ethical standards.
  • Structured outputs: Grok 3 supports structured outputs, enabling developers to specify JSON schemas for AI-powered automations.
  • Functions and Tools support: Like other xAI models, Grok 3 model supports functions and external tools that enable enterprises to build agentic workflows.

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

Grok 3 is purpose-built for common business use cases like data extraction, coding, and text summarization. Grok 3 excels at enterprise tasks like financial forecasting, medical diagnosis support, legal document analysis, and scientific research assistance—delivering precise, domain-specific solutions.

Out of scope use cases

The provider has not supplied this information.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.

Technical specs

The provider has not supplied this information.

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

The provider has not supplied this information.

Output formats

Grok 3 supports structured outputs, enabling developers to specify JSON schemas for AI-powered automations.

Supported languages

English, Spanish, French, Afrikaans, Arabic, Bengali, Welsh, German, Greek, Indonesian, Icelandic, Italian, Japanese, Korean, Latvian, Marathi, Nepali, Punjabi, Polish, Russian, Swahili, Telugu, Thai, Turkish, Ukrainian, Urdu, and Chinese.

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

Grok 3 supports a 131,072 token context window, enabling it to process and generate responses for extensive inputs while maintaining coherence and depth.

Optimizing model performance

The provider has not supplied this information.

Additional assets

The provider has not supplied this information.

Training disclosure

Training, testing and validation

Trained on a diverse dataset emphasizing high-quality, reasoning-rich content, it is particularly strong at drawing connections across domains and languages.

Distribution

Distribution channels

The provider has not supplied this information.

More information

Responsible AI considerations

Safety techniques

The provider has not supplied this information.

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: xAI To evaluate Grok 3's capabilities, xAI compared its performance against a set of models across various benchmarks using their internal benchmark platform. Below is a high-level overview of Grok 3's quality on representative benchmarks.
CategoryBenchmarkGrok 3 Score (%)
Math CompetitionAIME 202460.0
Graduate-Level ReasoningGPQA79.1
Code GenerationLiveCodeBench65.5
Multi-Task Language UnderstandingMMLU-Pro83.1
FactualitySimpleQA44.5
Instruction FollowingIFEval91.1
Agentic ShoppingTauBench-Retail77.4
Agentic Flight BookingTauBench-Airline43.0
Average68.0
State-of-the-Art Performance: Grok 3 achieves top-tier results among non-reasoning models on diverse academic benchmarks, including: Graduate-level science knowledge (GPQA), General knowledge (MMLU-Pro), and Math competition problems (AIME). Document Processing: Grok 3 excels at processing extensive documents and handling complex prompts while maintaining high instruction-following accuracy. Factuality and Style: Grok 3 demonstrates improved factual accuracy and enhanced stylistic control.

Benchmarking methodology

Source: xAI The provider has not supplied this information.

Public data summary

Source: xAI The provider has not supplied this information.
Model Specifications
Context Length131072
Quality Index0.85
LicenseCustom
Last UpdatedOctober 2025
Input TypeText
Output TypeText
ProviderxAI
Languages27 Languages