DeepSeek-R1-0528


The DeepSeek R1 0528 model has improved reasoning capabilities. This version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.
DeepSeek
Direct from Azure
Version: 1

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:

  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all part of Microsoft Foundry.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.

Learn more about Direct from Azure models.

About this model

Compared to the previous version, the upgraded model shows significant improvements in handling complex reasoning tasks. For instance, in the AIME 2025 test, the model's accuracy has increased from 70% in the previous version to 87.5% in the current version. This advancement stems from enhanced thinking depth during the reasoning process: in the AIME test set, the previous model used an average of 12K tokens per question, whereas the new version averages 23K tokens per question.
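The figures in the paragraph above can be put side by side with a little arithmetic. The following is a minimal sketch, not part of the model listing: the function name `summarize_gain` and the "share of previous errors eliminated" measure are illustrative choices, and only the four input numbers come from the text.

```python
# Figures quoted above: AIME 2025 accuracy and average tokens per question.
prev_accuracy, new_accuracy = 70.0, 87.5   # percent
prev_tokens, new_tokens = 12_000, 23_000   # average tokens per question


def summarize_gain(prev_acc, new_acc, prev_tok, new_tok):
    """Return (point gain, share of previous errors eliminated, token ratio).

    Hypothetical helper for illustration only.
    """
    point_gain = new_acc - prev_acc
    # Fraction of the previously failed questions that are now solved.
    error_reduction = point_gain / (100.0 - prev_acc)
    token_ratio = new_tok / prev_tok
    return point_gain, error_reduction, token_ratio


point_gain, error_reduction, token_ratio = summarize_gain(
    prev_accuracy, new_accuracy, prev_tokens, new_tokens
)
print(f"Accuracy gain: {point_gain:.1f} points")                      # 17.5 points
print(f"Share of previous errors eliminated: {error_reduction:.1%}")  # 58.3%
print(f"Token usage ratio: {token_ratio:.2f}x")                       # 1.92x
```

In other words, the accuracy gain of 17.5 points comes with roughly twice the reasoning tokens per question, which is worth keeping in mind when estimating latency and cost.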

Key model capabilities

Beyond its improved reasoning capabilities, this version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.

Category | Benchmark (Metric)                 | DeepSeek R1 | DeepSeek R1 0528
---------|------------------------------------|-------------|--------------------------------
General  | MMLU-Redux (EM)                    | 92.9        | 93.4
         | MMLU-Pro (EM)                      | 84.0        | 85.0
         | GPQA-Diamond (Pass@1)              | 71.5        | 81.0
         | SimpleQA (Correct)                 | 30.1        | 27.8
         | FRAMES (Acc.)                      | 82.5        | 83.0
         | Humanity's Last Exam (Pass@1)      | 8.5         | 17.7
Code     | LiveCodeBench (2408-2505) (Pass@1) | 63.5        | 73.3
         | Codeforces-Div1 (Rating)           | 1530        | 1930
         | SWE Verified (Resolved)            | 49.2        | 57.6
         | Aider-Polyglot (Acc.)              | 53.3        | 71.6
Math     | AIME 2024 (Pass@1)                 | 79.8        | 91.4
         | AIME 2025 (Pass@1)                 | 70.0        | 87.5
         | HMMT 2025 (Pass@1)                 | 41.7        | 79.4
         | CNMO 2024 (Pass@1)                 | 78.8        | 86.9
Tools    | BFCL_v3_MultiTurn (Acc)            | -           | 37.0
         | Tau-Bench (Pass@1)                 | -           | 53.5 (Airline) / 63.9 (Retail)

Note: We use the Agentless framework to evaluate model performance on SWE-Verified. We evaluate only text-only prompts in the HLE test set. GPT-4.1 plays the user role in the Tau-Bench evaluation.
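To see where the update helps most, the benchmark scores above can be compared row by row. The sketch below is illustrative only: the dictionary copies the percentage-scale rows that have a score for both versions (the Codeforces rating and the Tools rows are left out because they use a different scale or lack a baseline), and the variable names are hypothetical.

```python
# (DeepSeek R1, DeepSeek R1 0528) scores copied from the benchmark table above.
# Only percentage-scale benchmarks with both values are included.
scores = {
    "MMLU-Redux": (92.9, 93.4),
    "MMLU-Pro": (84.0, 85.0),
    "GPQA-Diamond": (71.5, 81.0),
    "SimpleQA": (30.1, 27.8),
    "FRAMES": (82.5, 83.0),
    "Humanity's Last Exam": (8.5, 17.7),
    "LiveCodeBench": (63.5, 73.3),
    "SWE Verified": (49.2, 57.6),
    "Aider-Polyglot": (53.3, 71.6),
    "AIME 2024": (79.8, 91.4),
    "AIME 2025": (70.0, 87.5),
    "HMMT 2025": (41.7, 79.4),
    "CNMO 2024": (78.8, 86.9),
}

# Point change per benchmark (new score minus previous score).
deltas = {name: new - old for name, (old, new) in scores.items()}

largest_gain = max(deltas, key=deltas.get)
regressions = [name for name, delta in deltas.items() if delta < 0]

# Print benchmarks from largest gain to largest drop.
for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<22} {delta:+.1f}")

print("Largest gain:", largest_gain)
print("Regressions:", regressions)
```

On these rows the largest gain is on HMMT 2025 (+37.7 points), and SimpleQA is the only benchmark where the newer version scores lower (-2.3 points).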

Quick facts

Model provider: DeepSeek
Type: Chat completion
Lifecycle: Deprecated
Input type: text
Output type: text
Context window: 163.84k
Token limits: 163.84k output
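The limits above can be turned into a simple pre-flight check before sending a request. This is a rough sketch under stated assumptions: it reads "163.84k" as 163,840 tokens, and it assumes that prompt and output tokens share the same context window, which the listing does not state. The helper names are hypothetical, and the token counts would come from whatever tokenizer or usage data is available.

```python
CONTEXT_WINDOW_TOKENS = 163_840  # "163.84k" context window, read as 163,840 tokens
MAX_OUTPUT_TOKENS = 163_840      # "163.84k output" token limit


def fits_in_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption (not stated in the listing): prompt and output tokens
    share the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


def remaining_output_budget(prompt_tokens: int) -> int:
    """Output tokens still available after the prompt, under the same assumption."""
    return max(0, min(MAX_OUTPUT_TOKENS, CONTEXT_WINDOW_TOKENS - prompt_tokens))


print(fits_in_context(4_000, 23_000))    # True
print(fits_in_context(150_000, 23_000))  # False
print(remaining_output_budget(150_000))  # 13840
```

Because the model averages about 23K reasoning tokens per AIME question, leaving generous room for output is a sensible default for hard reasoning prompts.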