Claude Opus 4.1
Version: 20250805
Models from Partners and Community
These models constitute the vast majority of the Azure AI Foundry Models and are provided by trusted third-party organizations, partners, research labs, and community contributors. These models offer specialized and diverse AI capabilities, covering a wide array of scenarios, industries, and innovations. One example of models from partners and community is the Claude family of state-of-the-art large language models developed by Anthropic, which support text and image input, text output, multilingual capabilities, and vision. See Anthropic's privacy policy to learn more about privacy. Learn how to deploy Anthropic models.
Characteristics of Models from Partners and Community:
- Developed and supported by external partners and community contributors.
- Diverse range of specialized models catering to niche or broad use cases.
- Typically validated by providers themselves, with integration guidelines provided by Azure.
- Community-driven innovation and rapid availability of cutting-edge models.
- Standard Azure AI integration, with support and maintenance managed by the respective providers.
Key capabilities
About this model
Claude Opus 4.1 is an industry leader for coding. It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve.
Key model capabilities
- Extended thinking: Extended thinking gives Claude enhanced reasoning capabilities for complex tasks (a minimal API sketch follows this list).
- Image & text input: With state-of-the-art vision capabilities, Claude Opus 4.1 can process images and return text outputs to analyze and understand charts, graphs, technical diagrams, reports, and other visual assets.
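The snippet below is a minimal sketch of enabling extended thinking, assuming access through the Anthropic Python SDK; the model name, token budgets, and prompt are illustrative only, not a definitive configuration.
import anthropic

# Assumes ANTHROPIC_API_KEY is set in the environment.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=16000,  # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8000},  # reserve tokens for reasoning
    messages=[{"role": "user", "content": "Plan a migration of a monolith to microservices."}],
)

# The response interleaves thinking blocks with the final text answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200])
    elif block.type == "text":
        print(block.text)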
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
Claude Opus 4.1 is an industry leader for coding and agent capabilities, especially agentic search. It excels for customers needing frontier intelligence:
- Advanced coding: Independently plan and execute complex development tasks end-to-end. It adapts to your style, thoughtfully plans and pivots, and maintains high code quality throughout.
- Long-horizon tasks and complex problem solving (virtual collaborator): Unlock new use cases involving long-horizon tasks that require memory, sustained reasoning, and long chains of actions.
- AI agents: Enable agents to tackle complex, multi-step tasks that require peak accuracy.
- Agentic search and research: Connect to multiple data sources to synthesize comprehensive insights across repositories (see the tool-use sketch after this list).
- Content creation: Create human-quality content with natural prose. Produce long-form creative content, technical documentation, marketing copy, and frontend design mockups.
- Memory and context management: Incorporates memory capabilities that allow it to effectively summarize and reference previous interactions.
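As an illustration of the agentic tool-use pattern referenced above, the following sketch defines a single hypothetical search_docs tool and lets the model decide whether to call it. The tool name, schema, and prompt are placeholders, and the call assumes the Anthropic Python SDK.
import anthropic

client = anthropic.Anthropic()

# Hypothetical tool describing one internal data source the agent may query.
tools = [{
    "name": "search_docs",
    "description": "Search an internal document repository and return matching passages.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Full-text search query"}},
        "required": ["query"],
    },
}]

response = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Summarize our latest quarterly revenue figures."}],
)

# When the model decides to call a tool, stop_reason is "tool_use" and the
# content includes a tool_use block with the arguments it chose.
if response.stop_reason == "tool_use":
    for block in response.content:
        if block.type == "tool_use":
            print(block.name, block.input)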
Out of scope use cases
Please refer to the Claude Opus 4.1 system card.
Pricing
Pricing is based on a number of factors. See pricing details here.
Technical specs
Please refer to the Claude Opus 4.1 system card.
Training cut-off date
March 2025
Input formats
- Image & text input: With powerful vision capabilities, Claude Opus 4.1 can process images and return text outputs to analyze and understand charts, graphs, technical diagrams, reports, and other visual assets.
- Text output: Claude Opus 4.1 can output text of a variety of types and formats, such as prose, lists, Markdown tables, JSON, HTML, code in various programming languages, and more.
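A minimal sketch of combined image and text input follows, assuming the Anthropic Python SDK and a local PNG file; the file name and prompt are placeholders.
import base64
import anthropic

client = anthropic.Anthropic()

# Hypothetical local image; other supported formats (JPEG, GIF, WebP) work the same way.
with open("chart.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "Summarize the main trends shown in this chart."},
        ],
    }],
)

print(response.content[0].text)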
Supported languages
Claude Opus 4.1 can understand and output a wide variety of languages, such as French, Standard Arabic, Mandarin Chinese, Japanese, Korean, Spanish, and Hindi. Performance will vary based on how well-resourced the language is.
Sample JSON response
200:
{
  "content": [
    {
      "text": "Hi! My name is Claude.",
      "type": "text"
    }
  ],
  "id": "msg_313Zva2CMHLNnXjNJJKqJ2EH",
  "model": "claude-opus-4-1-20250805",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 31,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": { "ephemeral_5m_input_tokens": 0, "ephemeral_1h_input_tokens": 0 },
    "output_tokens": 25,
    "service_tier": "standard"
  }
}
4XX:
{
  "error": {
    "message": "Invalid request",
    "type": "invalid_request_error"
  },
  "request_id": "<string>",
  "type": "error"
}
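For illustration, a minimal request sketch that would yield responses in the shapes above, assuming the Anthropic Python SDK; invalid requests surface the 4XX error body through typed exceptions.
import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-opus-4-1-20250805",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
    print(response.content[0].text)  # e.g. "Hi! My name is Claude."
    print(response.usage.input_tokens, response.usage.output_tokens)
except anthropic.APIStatusError as err:
    # Non-2XX responses raise exceptions carrying the status code and error message.
    print(err.status_code, err.message)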
Model architecture
Please refer to the Claude Opus 4.1 system card.
Long context
Claude Opus 4.1 has a 200K token context window.
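To check whether a prompt fits within the 200K window before sending it, a token-counting sketch like the following can help; it assumes a recent version of the Anthropic Python SDK that exposes messages.count_tokens, and the document variable is a placeholder.
import anthropic

client = anthropic.Anthropic()

long_document = "..."  # placeholder for the text you plan to send

count = client.messages.count_tokens(
    model="claude-opus-4-1-20250805",
    messages=[{"role": "user", "content": long_document}],
)

# Compare against the 200K context window before issuing the real request.
print(count.input_tokens)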
Optimizing model performance
Please refer to the Claude Opus 4.1 system card.
Additional assets
- Claude Documentation: Visit Anthropic's Claude documentation for a wealth of resources on model capabilities, prompting techniques, use case guidelines, and more.
- Extended Thinking Guide: Understand how best to use extended thinking with Claude.
- Claude Prompting Resources: Check out Anthropic's prompting tools and guides to learn how to craft prompts that elicit more helpful, nuanced responses.
- Claude Cookbooks: Check out example code for a variety of complex tasks, such as RAG from various web sources, making SQL queries, function calling, multimodal prompting, and more.
Distribution channels
- Claude API: For developers interested in building agents, Opus 4.1 is available on the Claude Developer Platform.
- Claude Code: Use Opus 4.1 with Anthropic's industry-leading coding agent, Claude Code.
More information
Data handling
By default, we may process customer data in select countries in the US, Europe, Asia and Australia. We will only store data in data centers located in the United States. For more on data handling and retention, see our Privacy Center.
By default, we will not use your inputs or outputs from our commercial products (Anthropic API and Claude Code Enterprise) to train our models. If you explicitly report feedback or bugs to us or otherwise choose to allow us to use your data, then we may use your chats and coding sessions to train our models.
For more information regarding your use of an Anthropic commercial offering, or to learn how to contact us regarding a privacy-related topic, see our Trust Center and Commercial Terms.
Responsible AI considerations
Safety techniques
The Claude Opus 4.1 system card describes in detail the evaluations Anthropic ran to assess the model's safety and alignment.
Safety evaluations
Claude Opus 4.1 represents incremental improvements over Claude Opus 4, with enhancements in reasoning quality, instruction-following, and overall performance. The Claude Opus 4.1 system card includes details of safety evaluations, including safeguards, agentic safety, alignment and welfare assessments, and reward hacking. The Claude Opus 4 system card includes details of a wide range of pre-deployment safety tests conducted in line with the commitments in our Responsible Scaling Policy; tests of the model's behavior around violations of our Usage Policy; evaluations of specific risks such as "reward hacking" behavior; and agentic safety evaluations for computer use and coding capabilities. In addition, it includes a detailed alignment assessment covering a wide range of misalignment risks identified in our research, and a model welfare assessment.
Known limitations
Please refer to the Claude Opus 4.1 system card and the Claude Opus 4 system card.
Acceptable use
Acceptable use policy
Anthropic's Usage Policy is intended to help our users stay safe and promote the responsible use of our products and services.
Quality and performance evaluations
| Capability | Benchmark | Opus 4.1 score |
|---|---|---|
| Agentic coding | SWE-bench Verified | 74.5% / 79.4% with parallel test-time compute |
| Agentic terminal coding | Terminal-bench | 46.5% |
| Agentic tool use | τ²-bench | Retail 86.8%, Airline 63.0%, Telecom 71.5% |
| Computer use | OSWorld | 44.4% |
| High school math competition | AIME 2025 | 78.0% |
| Graduate-level reasoning | GPQA Diamond | 81.0% |
| Multilingual Q&A | MMMLU | 89.5% |
| Visual reasoning | MMMU (validation) | 77.1% |
| Financial analysis | Finance Agent | 50.9% |
Benchmarking methodology
Claude models are hybrid reasoning models. The benchmarks reported here show the highest scores achieved with or without extended thinking. We've noted below for each result whether extended thinking was used:
- No extended thinking: SWE-bench Verified, Terminal-bench
- Extended thinking (up to 64K tokens): TAU-bench, GPQA Diamond, MMMLU, MMMU, and AIME
Public data summary
N/A
Model Specifications
| Specification | Value |
|---|---|
| Context length | 200,000 tokens |
| Quality index | 0.90 |
| Training data | March 2025 |
| Last updated | November 2025 |
| Input type | Text, Image, Code |
| Output type | Text |
| Provider | Anthropic |
| Languages | 8 languages |