Grok 4 Fast Reasoning
Version: 1
Direct from Azure models
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all as part of one Azure AI Foundry platform.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Azure AI Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Key capabilities
About this model
Grok 4 Fast is designed for low-latency reasoning and tool-calling applications, excelling in conversational AI, API integrations, and agentic workflows requiring near-Grok 4 capabilities at reduced cost.
Key model capabilities
It supports general-purpose tasks like query response, factual answering, and tool use (e.g., code execution, web search) on platforms like grok.com, x.com, and mobile apps. Its efficiency makes it ideal for high-throughput scenarios such as real-time chat, content generation, and lightweight automation.
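As a minimal, illustrative sketch (not official sample code), the snippet below calls a Grok 4 Fast deployment through the azure-ai-inference Python package. The AZURE_AI_ENDPOINT and AZURE_AI_API_KEY environment variables and the grok-4-fast-reasoning deployment name are assumptions; substitute the values from your own Azure AI Foundry deployment.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Endpoint, key, and deployment name are placeholders for your own
# Azure AI Foundry deployment.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_API_KEY"]),
)

response = client.complete(
    model="grok-4-fast-reasoning",  # assumed deployment name
    messages=[
        SystemMessage(content="You are a concise technical assistant."),
        UserMessage(content="Explain the steps to solve a quadratic equation."),
    ],
)
print(response.choices[0].message.content)
```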
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
Grok 4 Fast targets conversational AI, API integrations, agentic workflows, real-time chat, content generation, and lightweight automation; see About this model above for details.
Out of scope use cases
The model is not suited for high-risk, mission-critical applications without additional safeguards, such as unrestricted dual-use research (e.g., advanced CBRN planning) or unfiltered adversarial testing. It may underperform in extremely long-context tasks or non-English languages due to its generalist training. Prohibited uses, such as generating harmful, illegal, or disallowed content, are covered under Acceptable use policy below.
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Technical specs
The provider has not supplied this information.
Training cut-off date
The provider has not supplied this information.
Training time
The provider has not supplied this information.
Input formats
Preferred input is structured text prompts, including natural language queries or tool-use instructions, for example: "Explain the steps to solve a quadratic equation." The model expects clear, intent-explicit prompts for optimal performance, as detailed in xAI's system prompts repository.
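To illustrate the tool-use instructions mentioned above, here is a hedged sketch of function calling with the azure-ai-inference package. The web_search tool definition is hypothetical, as are the endpoint variables and deployment name; the model only proposes the call, and your code is responsible for executing it and returning the result.

```python
import json
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import (
    ChatCompletionsToolDefinition,
    FunctionDefinition,
    UserMessage,
)
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_API_KEY"]),
)

# A hypothetical web-search tool; the model returns a proposed call
# rather than executing anything itself.
search_tool = ChatCompletionsToolDefinition(
    function=FunctionDefinition(
        name="web_search",
        description="Search the web and return the top results.",
        parameters={
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    )
)

response = client.complete(
    model="grok-4-fast-reasoning",  # assumed deployment name
    messages=[UserMessage(content="What changed in the latest Azure AI Foundry release?")],
    tools=[search_tool],
)

message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        # Arguments arrive as a JSON string chosen by the model.
        print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```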
Output formats
The provider has not supplied this information.
Supported languages
English
Sample JSON response
The provider has not supplied this information.
Model architecture
The provider has not supplied this information.
Long context
Context length supports general conversational and tool-use workflows; the provider has not disclosed an exact token limit, though the Model Specifications table below lists a context length of 2,000,000 tokens. Performance excels in single-session reasoning, reducing complexity for iterative queries compared to larger models.
Optimizing model performance
The provider has not supplied this information.
Additional assets
The provider has not supplied this information.
Training disclosure
Training, testing and validation
The training dataset comprises a general-purpose pre-training corpus (publicly available Internet data, third-party data for xAI, user/contractor data, internally generated data) with filtering for quality and safety (e.g., de-duplication, classification). Post-training used reinforcement learning (human feedback, verifiable rewards, model grading) and supervised fine-tuning on tasks, tool use, and refusal demonstrations. Testing and validation used internal benchmarks (e.g., refusal datasets, AgentHarm, MASK) and human evaluations. No public data summary is available.
Distribution
Distribution channels
The provider has not supplied this information.
More information
The provider has not supplied this information.
Responsible AI considerations
Safety techniques
Post-training alignment focused on safety, including refusals for harmful requests (e.g., CBRN or cyber weapons, self-harm, CSAM) and robustness to adversarial inputs like jailbreaks. Techniques included supervised fine-tuning on demonstrations of correct refusal behaviors according to xAI's default safety policy, reinforcement learning for policy adherence, and system prompt injections for honesty and political objectivity. Human and automated evaluations targeted reduced deception, bias, and misuse risks, with emphasis on agentic tool-calling safeguards. Safety objectives targeted compliance with xAI's policy, preventing foreseeable harm while allowing non-malicious queries. In the xAI API, the model is deployed with a fixed system prompt prefix that reminds it of the safety policy, plus input filters to safeguard against abuse. Training data details are covered under Training disclosure above.
Safety evaluations
Safety evaluations included automated tests and human reviews to assess abuse potential (refusals, agentic harm, hijacking), concerning propensities (deception, sycophancy, bias), and dual-use capabilities (CBRN/cyber knowledge, persuasiveness), all conducted on a near-final release checkpoint. Red-teaming focused on jailbreaks, prompt injections, and policy circumvention, with mitigations such as system prompts reducing attack success rates to near zero. Collaboration with internal teams refined safeguards for API deployment. No public details on specific risk categories beyond the reported metrics were disclosed; measured results are summarized under Quality and performance evaluations below.
Known limitations
The model may exhibit residual risks in dual-use scenarios or under adversarial prompts, requiring user verification for sensitive applications. It is optimized for English and general queries and may underperform in niche or biased contexts, extremely long-context tasks, and non-English languages due to its generalist training. Risks include unintended deception or bias, mitigated by system prompts and input filters. Limitations include higher dishonesty when reasoning is disabled (a 0.63 rate), mitigated by enabling reasoning and adding honesty prompts. As noted under Out of scope use cases, the model is not suited for high-risk, mission-critical applications without additional safeguards.
Acceptable use
Acceptable use policy
Developers must comply with xAI's acceptable use policy and avoid harmful outputs. Prohibited uses include generating harmful, illegal, or disallowed content (e.g., CSAM, violent crimes), as outlined in xAI's acceptable use policy. For high-risk use cases, implement robust monitoring, truthfulness instructions, and human oversight to ensure reliability; a sketch of one such wrapper follows.
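Below is a minimal sketch of such a wrapper, under the assumption that responses flagged as uncertain go to a human review queue. The truthfulness prompt wording, the flagging heuristic, and the deployment name are all illustrative assumptions, not xAI or Azure guidance.

```python
import logging
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("grok-oversight")

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_API_KEY"]),
)

# Hypothetical truthfulness instruction; tune the wording for your domain.
TRUTHFULNESS_PROMPT = (
    "Answer truthfully. If you are not certain, say so explicitly "
    "instead of guessing."
)

def answer_with_oversight(question: str) -> str:
    """Call the model, log the exchange, and flag uncertain answers for review."""
    response = client.complete(
        model="grok-4-fast-reasoning",  # assumed deployment name
        messages=[
            SystemMessage(content=TRUTHFULNESS_PROMPT),
            UserMessage(content=question),
        ],
    )
    answer = response.choices[0].message.content
    log.info("question=%r answer=%r", question, answer)
    # Crude heuristic: route hedged answers to a human reviewer queue.
    if "not certain" in answer.lower() or "i don't know" in answer.lower():
        log.warning("Flagged for human review: %r", question)
    return answer
```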
Quality and performance evaluations
Source: xAI
Grok 4 Fast scored low on abuse metrics (e.g., a 0.00 answer rate on refusal datasets and 0.08 on AgentHarm) and concerning propensities (e.g., a 0.47 dishonesty rate on MASK and a 0.10 sycophancy rate), with dual-use capabilities below Grok 4 (e.g., 85.2% on WMDP Bio and 30.0% on CyBench). Human evaluations focused on safety robustness and truthfulness, complementing benchmarks like WMDP and AgentDojo. The model's efficiency outperforms rivals in latency-sensitive tasks. Evaluation scope and red-teaming are described under Safety evaluations above.
Benchmarking methodology
Source: xAI
Benchmarking used standardized prompts (e.g., refusal datasets, WMDP) for fair comparison. Human evaluations supplemented quantitative metrics, focusing on safety and robustness. No prompt adaptations were allowed, to ensure consistency. Further details on the methodology are not publicly available.
Public data summary
Source: xAI
The provider has not supplied this information.
Model Specifications
- Context length: 2,000,000 tokens
- Quality index: 0.89
- License: Custom
- Training data: September 2025
- Last updated: December 2025
- Input type: Text, Image
- Output type: Text
- Provider: xAI
- Languages: 1 language (English)