Claude Sonnet 4.5
Version: 20250929
Provider: Anthropic
Last updated: November 2025
Claude Sonnet 4.5 is Anthropic's most capable model for complex agents and an industry leader for coding and computer use.
Reasoning

Models from Partners and Community

These models constitute the vast majority of the Azure AI Foundry Models and are provided by trusted third-party organizations, partners, research labs, and community contributors. They offer specialized and diverse AI capabilities, covering a wide array of scenarios, industries, and innovations. One example is the Claude family of state-of-the-art large language models developed by Anthropic, which supports text and image input, text output, multilingual capabilities, and vision. See Anthropic's privacy policy to learn more about privacy. Learn how to deploy Anthropic models.
Characteristics of Models from Partners and Community:
  • Developed and supported by external partners and community contributors.
  • Diverse range of specialized models catering to niche or broad use cases.
  • Typically validated by providers themselves, with integration guidelines provided by Azure.
  • Community-driven innovation and rapid availability of cutting-edge models.
  • Standard Azure AI integration, with support and maintenance managed by the respective providers.
Models from Partners and Community can be deployed through the Managed Compute or serverless API deployment options. The model provider chooses which deployment options are available for each model.

Key capabilities

About this model

Claude Sonnet 4.5 is Anthropic's most capable model to date for building real-world agents and handling complex, long-horizon tasks, balancing the right speed and cost for high-volume use cases.

Key model capabilities

  • Extended thinking: Extended thinking gives Claude enhanced reasoning capabilities for complex tasks.
  • Image & text input: With strong vision capabilities, Claude Sonnet 4.5 can process images and return text outputs to analyze and understand charts, graphs, technical diagrams, reports, and other visual assets.
  • Computer use: Claude Sonnet 4.5 is Anthropic's most accurate model for computer use, enabling developers to direct Claude to use computers the way people do.

Use cases

See the Responsible AI considerations section for additional considerations on responsible use.

Key use cases

Claude Sonnet 4.5 is Anthropic's most capable model to date for building real-world agents and handling complex, long-horizon tasks, balancing the right speed and cost for high-volume use cases:
  • Long-running agents: Power production-ready assistants for multi-step, real-time applications, from customer support automation to complex operational workflows that require peak accuracy, intelligence, and speed.
  • Coding: Handle everyday development tasks with enhanced performance, or plan and execute complex software projects spanning hours or days, with the ability to save, maintain, and reference information across multiple sessions.
  • Cybersecurity: Deploy agents that autonomously patch vulnerabilities before exploitation, shifting from reactive detection to proactive defense.
  • Financial analysis: Conduct entry-level financial analysis, deliver advanced predictive analysis, or preemptively develop intelligent risk management strategies that leverage best-in-class domain knowledge.
  • Computer use: Claude Sonnet 4.5 is Anthropic's most accurate model for computer use, enabling developers to direct Claude to use computers the way people do.
  • Research: Perform focused analysis across multiple data sources, turning expert analysis into final deliverables. Ideal for complex problem solving, rapid business intelligence, and real-time decision support.

Out of scope use cases

Please refer to the Claude Sonnet 4.5 system card.

Pricing

Pricing is based on a number of factors. See pricing details here.

Technical specs

Please refer to the Claude Sonnet 4.5 system card.

Training cut-off date

July 2025

Input formats

Image & text input: With state-of-the-art vision capabilities, Claude Sonnet 4.5 can process images and return text outputs to analyze and understand charts, graphs, technical diagrams, reports, and other visual assets.
Text output: Claude Sonnet 4.5 can output text of a variety of types and formats, such as prose, lists, Markdown tables, JSON, HTML, code in various programming languages, and more.
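As a rough illustration of combined image and text input, the sketch below builds a request body with one image block and one text block. The field names follow Anthropic's Messages API as commonly documented, which is an assumption here and not something this page specifies. The helper name `build_image_request` and the `max_tokens` value are hypothetical, and a given deployment may expect a different model or deployment name. The code only constructs the payload and does not send it.

```python
import base64
import json


def build_image_request(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Build a request body pairing one image with one text question.

    Assumption: the body shape mirrors Anthropic's Messages API
    (base64-encoded image source followed by a text block).
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "claude-sonnet-4-5-20250929",  # model ID as shown in the sample response
        "max_tokens": 1024,  # hypothetical output cap chosen for illustration
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": encoded,
                        },
                    },
                    {"type": "text", "text": question},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file.
    request = build_image_request(b"\x89PNG", "image/png", "What does this chart show?")
    print(json.dumps(request, indent=2))
```

Placing the image block before the question is a stylistic choice in this sketch, not a requirement stated on this page.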

Supported language

Claude Sonnet 4.5 can understand and output a wide variety of languages, such as French, Standard Arabic, Mandarin Chinese, Japanese, Korean, Spanish, and Hindi. Performance will vary based on how well-resourced the language is.

Sample JSON response

200:
{
  "content": [
    {
      "text": "Hi! My name is Claude.",
      "type": "text"
    }
  ],
  "id": "msg_013Zva2CMHLNnXjNJJKqJ2EF",
  "model": "claude-sonnet-4-5-20250929",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 31,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": { "ephemeral_5m_input_tokens": 0, "ephemeral_1h_input_tokens": 0 },
    "output_tokens": 25,
    "service_tier": "standard"
  }
}
4XX:
{
  "error": {
    "message": "Invalid request",
    "type": "invalid_request_error"
  },
  "request_id": "<string>",
  "type": "error"
}
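The two response shapes above can be handled with a small parser. The following is a minimal sketch that assumes response bodies look like the 200 and 4XX samples shown here; the function name `summarize_response` and the returned dictionary layout are hypothetical choices for illustration.

```python
import json


def summarize_response(raw: str) -> dict:
    """Extract the reply text and token usage, or the error details."""
    body = json.loads(raw)
    if body.get("type") == "error":
        err = body.get("error", {})
        return {
            "ok": False,
            "error_type": err.get("type"),
            "message": err.get("message"),
        }
    # Concatenate every text block; blocks of other types are skipped.
    text = "".join(
        block.get("text", "")
        for block in body.get("content", [])
        if block.get("type") == "text"
    )
    usage = body.get("usage", {})
    return {
        "ok": True,
        "text": text,
        "stop_reason": body.get("stop_reason"),
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }


# Trimmed copies of the sample bodies above.
SUCCESS = json.dumps({
    "content": [{"text": "Hi! My name is Claude.", "type": "text"}],
    "role": "assistant",
    "stop_reason": "end_turn",
    "type": "message",
    "usage": {"input_tokens": 31, "output_tokens": 25},
})
FAILURE = json.dumps({
    "error": {"message": "Invalid request", "type": "invalid_request_error"},
    "request_id": "<string>",
    "type": "error",
})

if __name__ == "__main__":
    print(summarize_response(SUCCESS))
    print(summarize_response(FAILURE))
```

Branching on the top-level `type` field keeps the caller independent of the HTTP status code, which may be convenient when bodies are logged or replayed without their headers.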

Model architecture

Please refer to the Claude Sonnet 4.5 system card.

Long context

Claude Sonnet 4.5 has a 200K token context window, and supports up to 1M tokens in public beta. With longer context, developers can run more comprehensive and data-intensive use cases with Claude, including: large-scale code analysis, document synthesis, and context-aware agents.
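To make the two window sizes concrete, here is a minimal budgeting sketch. It assumes that input tokens and the reserved output tokens must together fit inside the context window, and that token counts are already known (for example from the `usage` field of an earlier response). The function name `pick_window` and the labels it returns are hypothetical.

```python
from typing import Optional

STANDARD_WINDOW = 200_000   # standard context window stated above
BETA_WINDOW = 1_000_000     # public beta window stated above


def pick_window(input_tokens: int, max_output_tokens: int) -> Optional[str]:
    """Return which window a request fits in, or None if it fits in neither.

    Assumption: input and reserved output tokens share the same window.
    """
    total = input_tokens + max_output_tokens
    if total <= STANDARD_WINDOW:
        return "standard"
    if total <= BETA_WINDOW:
        return "beta"
    return None


if __name__ == "__main__":
    for tokens in (150_000, 199_000, 1_000_000):
        print(tokens, pick_window(tokens, max_output_tokens=8_000))
```

A check like this can run before a request is sent, so that oversized inputs are trimmed or routed to the beta window instead of failing at the API.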

Optimizing model performance

Please refer to the Claude Sonnet 4.5 system card.

Additional assets

  • Claude Documentation: Visit Anthropic's Claude documentation for a wealth of resources on model capabilities, prompting techniques, use case guidelines, and more.
  • Extended Thinking Guide: Understand how best to use extended thinking with Claude.
  • Claude Prompting Resources: Check out Anthropic's prompting tools and guides to learn how to craft prompts that elicit more helpful, nuanced responses.
  • Claude Cookbooks: Check out example code for a variety of complex tasks, such as RAG from various web sources, making SQL queries, function calling, multimodal prompting, and more.

Distribution channels

  • Claude API: For developers interested in building agents, Sonnet 4.5 is available on the Claude Developer Platform.
  • Claude Code: Use Sonnet 4.5 with Anthropic's industry-leading coding agent, Claude Code.

More information

Data handling

By default, we may process customer data in select countries in the US, Europe, Asia and Australia. We will only store data in data centers located in the United States. For more on data handling and retention, see our Privacy Center.
By default, we will not use your inputs or outputs from our commercial products (Anthropic API and Claude Code Enterprise) to train our models. If you explicitly report feedback or bugs to us or otherwise choose to allow us to use your data, then we may use your chats and coding sessions to train our models.
To find out more information regarding your use of an Anthropic commercial offering, or if you would like to know how to contact us regarding a privacy related topic, see our Trust Center and Commercial Terms.

Responsible AI considerations

Safety techniques

The Claude Sonnet 4.5 system card describes in detail the wide range of evaluations Anthropic ran to assess the model's safety and alignment.

Safety evaluations

Claude Sonnet 4.5 has a substantially improved safety profile compared to previous Claude models. The Claude Sonnet 4.5 system card includes tests related to model safeguards; assessments of safety in agentic situations where the model is working autonomously; cybersecurity evaluations; a detailed alignment assessment including stress-testing of the model in unusual and extreme scenarios; evaluations of model honesty and reward-hacking behavior; a tentative investigation of model welfare concerns; and a set of analyses mandated by our Responsible Scaling Policy on risks for the production of dangerous weapons and autonomous AI research & development.

Known limitations

Please refer to the Claude Sonnet 4.5 system card.

Acceptable use

Acceptable use policy

Anthropic's Usage Policy is intended to help our users stay safe and promote the responsible use of our products and services.

Quality and performance evaluations

Claude Sonnet 4.5 is the best coding model in the world, ideal for powering complex agents, computer use, and logic-heavy tasks.
Benchmark | Test Name | Sonnet 4.5 Score
Agentic coding | SWE-bench Verified | 77.2% / 82.0% with parallel test-time compute
Agentic terminal coding | Terminal-bench | 50.0%
Agentic tool use | τ2-bench | Retail 86.2%, Airline 70.0%, Telecom 98.0%
Computer use | OSWorld | 61.4%
High school math competition | AIME 2025 | 87.0% (no tools), 100% (python)
Graduate-level reasoning | GPQA Diamond | 83.4%
Multilingual Q&A | MMMLU | 89.1%
Visual reasoning | MMMU (validation) | 77.8%
Financial analysis | Finance Agent | 55.3%

Benchmarking methodology

SWE-bench Verified: All Claude results were reported using a simple scaffold with two tools: bash and file editing via string replacements. We report 77.2%, averaged over 10 trials with no test-time compute and a 200K thinking budget on the full 500-problem SWE-bench Verified dataset. The score reported uses a minor prompt addition: "You should use tools as much as possible, ideally more than 100 times. You should also implement your own tests first before attempting the problem." A 1M context configuration achieves 78.2%, but we report the 200K result as our primary score because the 1M configuration was implicated in our recent inference issues.
For our "high compute" numbers, we adopt additional complexity and parallel test-time compute as follows:
  • We sample multiple parallel attempts.
  • We discard patches that break the visible regression tests in the repository, similar to the rejection sampling approach adopted by Agentless (Xia et al. 2024); note no hidden test information is used.
  • We then use an internal scoring model to select the best candidate from the remaining attempts.
  • This results in a score of 82.0% for Sonnet 4.5.
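The selection procedure in the list above can be sketched as a filter followed by an argmax. This is an illustrative outline only, not Anthropic's actual harness: the `Candidate` type, the `passes_regression` flag, and the scoring callable are hypothetical stand-ins for the regression-test run and the internal scoring model, neither of which is described in detail here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    patch: str               # one sampled attempt at a fix
    passes_regression: bool  # outcome of the repository's visible regression tests


def select_patch(
    candidates: Sequence[Candidate],
    score: Callable[[Candidate], float],
) -> Optional[Candidate]:
    """Drop patches that break visible regression tests, then keep the top scorer."""
    survivors = [c for c in candidates if c.passes_regression]
    if not survivors:
        return None
    return max(survivors, key=score)


if __name__ == "__main__":
    attempts = [
        Candidate("patch-a", passes_regression=False),
        Candidate("patch-b", passes_regression=True),
        Candidate("patch-c", passes_regression=True),
    ]
    # Toy scores standing in for the internal scoring model.
    toy_scores = {"patch-a": 0.9, "patch-b": 0.4, "patch-c": 0.7}
    best = select_patch(attempts, lambda c: toy_scores[c.patch])
    print(best.patch if best else "no surviving candidate")
```

Filtering before scoring means a high-scoring patch that breaks existing tests can never be selected, which matches the order of the steps listed above.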
Terminal-Bench: All scores reported use the default agent framework (Terminus 2) with the XML parser, averaged over multiple runs on different days to smooth out the eval's sensitivity to inference infrastructure.
τ2-bench: Scores were achieved using extended thinking with tool use and a prompt addendum to the Airline and Telecom Agent Policy instructing Claude to better target its known failure modes when using the vanilla prompt. A prompt addendum was also added to the Telecom User prompt to avoid failure modes from the user ending the interaction incorrectly.
AIME: The Sonnet 4.5 score is reported using sampling at temperature 1.0. The model used 64K reasoning tokens for the Python configuration.
OSWorld: All scores reported use the official OSWorld-Verified framework with 100 max steps, averaged across 4 runs.
MMMLU: All scores reported are the average of 5 runs over 14 non-English languages with extended thinking (up to 128K).
Finance Agent: All scores reported were run and published by Vals AI on their public leaderboard. All Claude model results reported are with extended thinking (up to 64K), and Sonnet 4.5 is reported with interleaved thinking on.

Public data summary

N/A
Model Specifications
Context Length: 200000
Quality Index: 0.92
Training Data: July 2025
Last Updated: November 2025
Input Type: Text, Image, Code
Output Type: Text
Provider: Anthropic
Languages: 8 Languages