Grok 3 Mini
Version: 1
xAI
Last updated October 2025
Grok 3 Mini is a lightweight model that thinks before responding. Trained on mathematical and scientific problems, it is well suited to logic-based tasks.
Agents
Reasoning
Coding

Azure Direct Models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all as part of one Azure AI Foundry platform.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Azure AI Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

Key capabilities

About this model

Grok 3 Mini delivers state-of-the-art results among non-reasoning models across diverse academic benchmarks, including graduate-level science knowledge (GPQA), general knowledge (MMLU-Pro), and math competition problems (AIME).

Key model capabilities

  • Extended Context Length: With a context window of 131,072 tokens, Grok 3 Mini processes and understands vast datasets in a single pass, ideal for comprehensive analysis of large documents or complex workflows.
  • Exposed Reasoning Tokens: Unlike traditional black-box thinking models, Grok 3 Mini lets users inspect its reasoning tokens. This transparency helps enterprises and educators understand the "why" behind answers, reflecting xAI's commitment to openness.
  • Steerability & Chain of Command: Grok 3 Mini is extremely steerable and follows instructions closely. The model is less likely to refuse queries, providing more helpful responses while maintaining safety and ethical standards.
  • Reasoning effort parameter: For finer-grained control over the model's performance, Grok 3 Mini supports the reasoning effort parameter, which lets users adjust how much the model thinks, with low and high settings.
  • Structured outputs: Grok 3 Mini supports structured outputs, enabling developers to specify JSON schemas for AI-powered automations.
  • Functions and Tools support: Similar to other xAI models, Grok 3 Mini supports functions and external tools that enable enterprises to build agentic workflows.
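These capabilities surface together in an OpenAI-compatible chat request. A minimal sketch of such a request body follows; the model name, the `reasoning_effort` field, and the tool definition are assumptions based on xAI's published API conventions, and the weather tool is purely illustrative:

```python
import json

# Hypothetical tool definition in the OpenAI-compatible "tools" format;
# the function name and schema are made up for illustration.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request body for a reasoning call; no network call is made here.
payload = {
    "model": "grok-3-mini",
    "reasoning_effort": "high",  # "low" or "high", per the card
    "messages": [{"role": "user", "content": "Is it warmer in Oslo or Rome today?"}],
    "tools": [weather_tool],
}

print(json.dumps(payload, indent=2))
```

A real call would POST this payload to the serving endpoint; the exposed reasoning tokens would then arrive alongside the final answer in the response.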

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

The model is optimized for logic-based tasks, such as:
  • Coding environments: working inside codebases and local development environments.
  • Agentic workflows: building robust LLM ontologies and agent architectures.
  • Reasoning tasks: difficult mathematics and science-based questions.
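An agentic workflow of the kind listed above typically reduces to a loop: send a request, execute any tool calls the model returns, and feed the results back. A simplified sketch with a stubbed assistant turn follows; the response shape assumes the OpenAI-compatible convention, and the tool and dispatch table are illustrative:

```python
import json

def get_weather(city: str) -> str:
    """Illustrative local tool; a real agent would call an actual API."""
    return f"18C and clear in {city}"

TOOLS = {"get_weather": get_weather}

# Stubbed assistant turn containing a tool call; a real loop would
# receive this from the model instead of hard-coding it.
assistant_turn = {
    "tool_calls": [
        {"id": "call_1",
         "function": {"name": "get_weather",
                      "arguments": json.dumps({"city": "Oslo"})}}
    ]
}

def run_tool_calls(turn):
    """Execute each requested tool and build the follow-up tool messages."""
    results = []
    for call in turn.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({"role": "tool",
                        "tool_call_id": call["id"],
                        "content": fn(**args)})
    return results

follow_up = run_tool_calls(assistant_turn)
print(follow_up)
```

The follow-up messages are appended to the conversation and sent back to the model, which then produces its final answer from the tool results.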

Out of scope use cases

The provider has not supplied this information.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used; see the Azure AI Foundry pricing page for details.

Technical specs

The model was trained via reinforcement learning with a focus on reasoning for agentic coding tasks, and excels at utilizing tools to solve complex logical problems in novel environments.

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

The provider has not supplied this information.

Output formats

Grok 3 Mini supports structured outputs, enabling developers to specify JSON schemas for AI-powered automations.
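A structured-output request pairs a JSON schema with the call and gets back a reply that parses directly into typed fields. A minimal sketch follows; the invoice schema is hypothetical, and the `response_format` wrapper assumes the OpenAI-compatible convention rather than anything stated on this card:

```python
import json

# Hypothetical schema for an invoice-extraction automation.
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["vendor", "total", "currency"],
}

# OpenAI-compatible wrapper sent with the chat request.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "schema": invoice_schema, "strict": True},
}

# A structured reply is plain JSON, so it parses straight into a dict.
reply = '{"vendor": "Acme", "total": 41.5, "currency": "EUR"}'  # example output
invoice = json.loads(reply)
missing = [k for k in invoice_schema["required"] if k not in invoice]
print(invoice, missing)
```

Schema-constrained replies remove the parsing and retry logic that free-text outputs otherwise require in automation pipelines.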

Supported languages

English, Spanish, French, Afrikaans, Arabic, Bengali, Welsh, German, Greek, Indonesian, Icelandic, Italian, Japanese, Korean, Latvian, Marathi, Nepali, Punjabi, Polish, Russian, Swahili, Telugu, Thai, Turkish, Ukrainian, Urdu, and Chinese.

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

Grok 3 Mini supports a 131,072 token context window for understanding codebases and enterprise documents, processing vast datasets in a single pass, ideal for comprehensive analysis of large documents or complex workflows.
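In practice, a long-context workflow still benefits from a guard that estimates whether a document fits the window before sending it. A rough sketch follows; the 131,072-token limit comes from this card, while the 4-characters-per-token heuristic is a common approximation, not xAI's tokenizer:

```python
CONTEXT_WINDOW = 131_072   # tokens, per the model card
CHARS_PER_TOKEN = 4        # rough heuristic, not the real tokenizer

def fits_in_context(text: str, reserved_for_output: int = 8_192) -> bool:
    """Estimate whether `text` plus an output budget fits the window."""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW

document = "lorem ipsum " * 10_000   # ~120k characters, roughly 30k tokens
print(fits_in_context(document))
```

Documents that fail the check can be chunked or summarized before the call instead of triggering a context-length error at the API.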

Optimizing model performance

For finer-grained control over the model's performance, Grok 3 Mini supports the reasoning effort parameter, which lets users adjust how much the model thinks, with low and high settings.
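One way to use the parameter is to route routine traffic to low effort and hard problems to high effort, trading latency and cost against depth of reasoning. A hypothetical routing helper follows; the task categories and the `reasoning_effort` field placement are illustrative assumptions, not taken from this card:

```python
# Task categories routed to high effort; illustrative, not from the card.
HARD_CATEGORIES = {"math", "science", "code-review"}

def pick_reasoning_effort(category: str) -> str:
    """Return the reasoning effort setting to send for a given task type."""
    return "high" if category in HARD_CATEGORIES else "low"

# Example request body using the helper; no network call is made here.
request = {
    "model": "grok-3-mini",
    "reasoning_effort": pick_reasoning_effort("math"),
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
}
print(request["reasoning_effort"])
```
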

Additional assets

The provider has not supplied this information.

Training disclosure

Training, testing and validation

The provider has not supplied this information.

Distribution

Distribution channels

The provider has not supplied this information.

More information

Model developer: xAI
Model Release Date: May 19, 2025

Responsible AI considerations

Safety techniques

The provider has not supplied this information.

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: xAI
To understand its capabilities, xAI evaluated Grok 3 Mini (High) on a variety of benchmarks using its internal benchmarking platform. Grok 3 Mini (High) delivers state-of-the-art results among non-reasoning models across diverse academic benchmarks, including graduate-level science knowledge (GPQA), general knowledge (MMLU-Pro), and math competition problems (AIME). Below is a high-level overview of the model quality on representative benchmarks:
Category                            Benchmark        Grok 3 Mini (High) Score (%)
Math Competition                    AIME 2024        90.7
Graduate-Level Reasoning            GPQA             80.3
Code Generation                     LiveCodeBench    74.8
Multi-Task Language Understanding   MMLU-Pro         82.8
Average                                              82.2
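As a quick check, the Average row is the unweighted mean of the four benchmark scores, which works out to 82.15 and is reported as 82.2 on the card:

```python
# Benchmark scores from the table above.
scores = [90.7, 80.3, 74.8, 82.8]

# Unweighted mean: 328.6 / 4 = 82.15, shown as 82.2 on the card.
avg = sum(scores) / len(scores)
print(avg)
```
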

Benchmarking methodology

Source: xAI
The provider has not supplied this information.

Public data summary

Source: xAI
The provider has not supplied this information.
Model Specifications
Context Length: 131,072
Quality Index: 0.87
License: Custom
Last Updated: October 2025
Input Type: Text
Output Type: Text
Provider: xAI
Languages: 27 languages