Grok Code Fast 1
Version: 1
Grok Code Fast 1, developed by xAI, is a purpose-built AI model for agentic coding, launched on August 28, 2025. It uses a new lightweight transformer-based architecture optimized for speed and cost-efficiency, trained on a pre-training corpus rich in programming content and fine-tuned on real-world pull requests and coding tasks. Unlike generalist models, it prioritizes low-latency responses and tool integration (e.g., grep, terminal, file editing), making it well suited to iterative coding workflows in coding tools such as GitHub Copilot and Cursor. Its 256,000-token context window supports large codebases, and prompt caching achieves hit rates above 90%, enhancing responsiveness.
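For orientation, here is a minimal sketch of calling the model over its API, assuming xAI's OpenAI-compatible endpoint (https://api.x.ai/v1) and the grok-code-fast-1 model id; verify both against the current xAI documentation. Streaming is shown because the model is optimized for low-latency, iterative use.

```python
# A minimal sketch, assuming xAI's OpenAI-compatible endpoint and the
# "grok-code-fast-1" model id; verify both against current xAI docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.x.ai/v1",  # xAI's OpenAI-compatible endpoint
    api_key="YOUR_XAI_API_KEY",      # placeholder credential
)

# Stream the response to take advantage of the model's low latency.
stream = client.chat.completions.create(
    model="grok-code-fast-1",
    messages=[
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```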
Post-training focused on aligning the model for practical coding tasks, with human evaluations by developers ensuring usability. The model excels in languages like TypeScript, Python, Java, Rust, C++, and Go, and supports structured outputs and function calling for seamless integration with development tools. It differs from larger models like Grok 4 by prioritizing speed and cost over broad reasoning capabilities.
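As a hedged illustration of the function-calling support, the sketch below declares a tool using the OpenAI-compatible "tools" schema. The run_grep tool name and its parameters are invented for this example; a real tool would be implemented by the host application (IDE or agent harness), not by xAI's API.

```python
# A hedged sketch of function calling via the OpenAI-compatible "tools"
# schema. The run_grep tool is hypothetical and would be implemented by
# the host application.
from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="YOUR_XAI_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "run_grep",  # hypothetical tool name
        "description": "Search the repository for a pattern and return matching lines.",
        "parameters": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string", "description": "Regex to search for."},
                "path": {"type": "string", "description": "Directory to search in."},
            },
            "required": ["pattern"],
        },
    },
}]

response = client.chat.completions.create(
    model="grok-code-fast-1",
    messages=[{"role": "user", "content": "Where is parse_config defined in this repo?"}],
    tools=tools,
)

# If the model requests a tool call, execute it locally and return the result
# in a follow-up message with role "tool".
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```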
Model developer: xAI
Supported languages: English
Model Release Date: August 28, 2025
Intended Use
Alignment approach
Post-training alignment used high-quality datasets reflecting real-world coding tasks, such as pull requests and bug fixes, to enhance practical utility. Safety alignment targeted reliability and usability, with human evaluations by experienced developers to refine behavior in agentic workflows. Techniques included supervised fine-tuning and reinforcement learning to ensure accurate code generation and tool use, with a focus on minimizing errors in iterative coding scenarios. Safety objectives included preventing disallowed content (e.g., harmful or copyrighted code) and ensuring compliance with developer workflows.
Primary Use Cases
Grok Code Fast 1 is designed for agentic coding tasks, excelling at rapid prototyping, bug fixing, and navigating large codebases with minimal oversight. It integrates seamlessly with coding tools such as GitHub Copilot and Cursor, supporting developers in tasks like code snippet generation, project setup, and automated edits in TypeScript, Python, Java, Rust, C++, and Go. Its speed and low-cost API make it well suited to high-throughput tasks like CI automation and batch code generation.
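As one sketch of such a high-throughput batch workflow, the snippet below fans requests out over a thread pool; the task list, worker count, and endpoint configuration are illustrative assumptions.

```python
# A minimal sketch of batch code generation for CI-style workloads; the
# task list and worker count are illustrative, not recommendations.
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="YOUR_XAI_API_KEY")

tasks = [
    "Add type hints to utils/io.py.",
    "Write a unit test for parse_config().",
    "Generate a Dockerfile for a Flask app.",
]

def generate(task: str) -> str:
    resp = client.chat.completions.create(
        model="grok-code-fast-1",
        messages=[{"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

# Fan the tasks out concurrently; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    for task, result in zip(tasks, pool.map(generate, tasks)):
        print(f"=== {task}\n{result[:200]}\n")
```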
Out-of-scope Use Cases
The model is not suited for complex, mission-critical projects requiring extensive reasoning or multimodal inputs beyond text. It may underperform in non-coding tasks or non-English languages due to its coding-focused training. Prohibited uses include generating harmful, illegal, or copyrighted content, as outlined in xAI’s acceptable use policy.
Input formats
Preferred input is structured text prompts, including code snippets or natural language instructions. Example:
- Write a Python function to calculate Fibonacci numbers up to n.
- The model expects clear, task-specific prompts for optimal performance, as detailed in xAI’s Prompt Engineering Guide; see the sketch after this list.
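A hedged sketch of the example prompt above made task-specific (pinning down the function name, return type, and an edge case); the exact phrasing is an illustrative assumption, not a prescribed format.

```python
# Illustrative only: a clear, task-specific version of the Fibonacci prompt.
from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="YOUR_XAI_API_KEY")

prompt = (
    "Write a Python function fib_up_to(n) that returns a list of all "
    "Fibonacci numbers <= n. Include a docstring and return [] for n < 0."
)
response = client.chat.completions.create(
    model="grok-code-fast-1",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```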
Responsible AI considerations
The model may produce errors in complex coding scenarios, requiring developer verification for critical applications. It is optimized for English and major programming languages and may underperform in niche or non-English contexts. Risks include generating incomplete or incorrect code, mitigated by encouraging small, focused prompts and human oversight. Developers must comply with xAI’s acceptable use policy, avoiding harmful or illegal outputs. For high-risk use cases, implement robust testing and validation to ensure reliability.
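One minimal sketch of such validation: write the generated code to a temporary directory, run a known test against it in a subprocess, and reject the output on failure. The generated function, test case, and file names here are illustrative.

```python
# Verify model-generated code against a known test before accepting it.
import subprocess
import sys
import tempfile
from pathlib import Path

generated = """\
def fib_up_to(n):
    out, a, b = [], 0, 1
    while a <= n:
        out.append(a)
        a, b = b, a + b
    return out
"""

check = """\
from candidate import fib_up_to
assert fib_up_to(10) == [0, 1, 1, 2, 3, 5, 8]
assert fib_up_to(-1) == []
"""

with tempfile.TemporaryDirectory() as d:
    Path(d, "candidate.py").write_text(generated)
    Path(d, "check.py").write_text(check)
    # Run the check in a subprocess with a timeout so bad code can't hang us.
    result = subprocess.run([sys.executable, "check.py"], cwd=d,
                            capture_output=True, text=True, timeout=10)

if result.returncode != 0:
    raise RuntimeError(f"Generated code failed validation:\n{result.stderr}")
print("validation passed")
```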
Safety evaluation and red-teaming
Safety evaluations included automated tests and human reviews to assess disallowed content (e.g., sexual, violent, or copyrighted material) and jailbreak risks. Collaboration with launch partners such as GitHub Copilot refined tool-use safety. Red-teaming focused on coding-specific risks, ensuring compliance with developer workflows. No public details on specific risk categories or outcomes were disclosed.
Data Overview
Training, testing, and validation datasets
The training dataset comprises a large pre-training corpus of programming-related content (e.g., open-source code, documentation) and post-training datasets of real-world pull requests and coding tasks. Sources include public code repositories and curated synthetic data; no use of user data or private third-party data has been disclosed. The dataset scale is not specified, but it emphasizes diversity in programming languages and tasks. Testing and validation used internal benchmarks and human evaluations by developers. No public data summary is available.
Long context
The 256,000-token context window supports large codebases, enabling tasks like repository-wide refactors and multi-file edits. Compared with GPT-4o (128,000 tokens), it handles larger contexts, though it trails models offering 1M-token windows. It performs best in single-session codebase reasoning, reducing retrieval complexity.
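As a rough sketch of budgeting a repository against the 256,000-token window, the snippet below packs files until an approximate budget is exhausted. The four-characters-per-token estimate, the output headroom, and the repository path are all assumptions, not properties of the model's actual tokenizer.

```python
# Pack repository files into an approximate 256k-token context budget.
from pathlib import Path

CONTEXT_TOKENS = 256_000
RESERVED_FOR_OUTPUT = 16_000  # headroom for the response (assumption)

def approx_tokens(text: str) -> int:
    return len(text) // 4  # crude heuristic; use a real tokenizer if available

budget = CONTEXT_TOKENS - RESERVED_FOR_OUTPUT
sections, used = [], 0
for path in sorted(Path("my_repo").rglob("*.py")):  # hypothetical repo path
    source = path.read_text(errors="ignore")
    cost = approx_tokens(source)
    if used + cost > budget:
        break  # stop once the window is (approximately) full
    sections.append(f"### {path}\n{source}")
    used += cost

prompt = "Refactor the logging across these files:\n\n" + "\n\n".join(sections)
```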
Grok Code Fast 1 Benchmark Performance Overview
Grok Code Fast 1 scored 70.8% on SWE-Bench Verified (internal harness), competitive with smaller models like GPT-5-nano but trailing larger models in complex reasoning. It excels in coding accuracy (93.0%) and instruction following (75.0%), with 100% reliability across seven benchmarks. Human evaluations prioritized developer experience in agentic workflows, complementing benchmarks like SWE-Bench. Limitations include reduced accuracy on complex tasks, mitigated by encouraging iterative prompting. At up to 160 tokens/second, the model outpaces rivals such as Claude Sonnet in coding efficiency.
Appendix
Benchmarking used SWE-Bench Verified with standardized prompts for fair comparison. Human evaluations supplemented quantitative metrics, focusing on real-world coding tasks. No prompt adaptations were allowed, to ensure consistency. Further details on the methodology are not publicly available.
Model Specifications
Context Length: 256,000 tokens
License: Custom
Last Updated: September 2025
Input Type: Text
Output Type: Text
Publisher: xAI
Languages: 1 Language (English)