DeepSeek-V3-0324
Version: 1
DeepSeek · Last updated December 2025
DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.
Coding
Agents

Direct from Azure models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless provisioned throughput unit (PTU) portability across models hosted on Azure, all part of Microsoft Foundry.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

Key capabilities

About this model

DeepSeek-V3-0324 shows significant improvements over its predecessor, DeepSeek-V3, in several key aspects.

Key model capabilities

Reasoning Capabilities
  • Significant improvements in benchmark performance:
    • MMLU-Pro: 75.9 → 81.2 (+5.3)
    • GPQA: 59.1 → 68.4 (+9.3)
    • AIME: 39.6 → 59.4 (+19.8)
    • LiveCodeBench: 39.2 → 49.2 (+10.0)
Front-End Web Development
  • Improved executability of generated code
  • More aesthetically pleasing web pages and game front-ends
Chinese Writing Proficiency
  • Enhanced style and content quality:
    • Aligned with the R1 writing style
    • Better quality in medium-to-long-form writing
  • Feature enhancements:
    • Improved multi-turn interactive rewriting
    • Optimized translation quality and letter writing
Chinese Search Capabilities
  • Enhanced report analysis requests with more detailed outputs
Function Calling Improvements
  • Increased accuracy in function calling, fixing issues from previous V3 versions (see the sketch after this list)
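
The function-calling improvements can be exercised through the tool-calling interface exposed by Foundry deployments. Below is a minimal, hypothetical sketch using the Azure AI Inference SDK (azure-ai-inference); the endpoint and key environment variables, the deployment name, and the get_weather tool are illustrative placeholders, not values from this model card.

# Hypothetical sketch: exercising DeepSeek-V3-0324's function calling via the
# Azure AI Inference SDK (pip install azure-ai-inference). Endpoint, key, and
# the get_weather tool are placeholders, not part of this model card.
import os
import json
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import (
    ChatCompletionsToolDefinition,
    FunctionDefinition,
    UserMessage,
)
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],  # your Foundry deployment endpoint
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_KEY"]),
)

# Declare a tool the model may call; the schema uses the familiar JSON-schema format.
weather_tool = ChatCompletionsToolDefinition(
    function=FunctionDefinition(
        name="get_weather",
        description="Get the current weather for a city.",
        parameters={
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
)

response = client.complete(
    model="DeepSeek-V3-0324",  # deployment name; adjust to your own deployment
    messages=[UserMessage(content="What's the weather in Hangzhou right now?")],
    tools=[weather_tool],
)

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)

If the model decides to call the tool, the application executes get_weather itself and returns the result in a follow-up message; that round trip is omitted here for brevity.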

Use cases

See the Responsible AI considerations section for additional guidance on responsible use.

Key use cases

The provider has not supplied this information.

Out of scope use cases

Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs more safe and appropriate for users, resulting in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks. We recommend customers use Azure AI Content Safety in conjunction with this model and conduct their own evaluations on production systems. When deployed via Microsoft Foundry, prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.

Technical specs

DeepSeek-V3-0324 is a Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated for each token.
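
For intuition only, the arithmetic below relates the two figures above: per-token inference cost tracks the 37 billion activated parameters rather than the full 671 billion. The 2-FLOPs-per-active-parameter rule of thumb is a generic transformer approximation, not a figure supplied by the provider.

# Back-of-the-envelope arithmetic on the MoE split quoted above (illustrative only).
total_params = 671e9    # total parameters
active_params = 37e9    # parameters activated per token

print(f"Active per token: {active_params / total_params:.1%}")  # ~5.5%

# Generic transformer rule of thumb (~2 FLOPs per active parameter per token):
# per-token inference compute tracks the 37B active parameters, not all 671B.
print(f"~{2 * active_params / 1e9:.0f} GFLOPs per generated token (order of magnitude)")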

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

The provider has not supplied this information.

Output formats

The provider has not supplied this information.

Supported languages

The provider has not supplied this information.

Sample JSON response

The provider has not supplied this information.

Model architecture

The model adopts Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture, both of which were thoroughly validated in DeepSeek-V2. Additionally, DeepSeek-V3-0324 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for enhanced performance.
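
The DeepSeek-V3 technical report describes the auxiliary-loss-free strategy as a per-expert bias that is added to the routing scores only when selecting experts, and is then nudged after each step to rebalance load. The sketch below is a simplified, illustrative rendering of that idea; the function names, the sign-based update, and the gamma step size are our own simplifications, not DeepSeek's code.

# Illustrative sketch of bias-based, auxiliary-loss-free expert load balancing,
# loosely following the description in the DeepSeek-V3 technical report.
import numpy as np

def route(affinity, bias, k):
    """Select top-k experts per token with a biased score; gate with raw scores.

    affinity: [tokens, experts] non-negative token-to-expert scores (e.g. sigmoid)
    bias:     [experts] balancing bias maintained outside the loss
    """
    biased = affinity + bias                                  # bias steers WHICH experts win...
    topk = np.argpartition(-biased, k - 1, axis=1)[:, :k]
    gates = np.take_along_axis(affinity, topk, axis=1)        # ...but not HOW MUCH they count
    gates = gates / gates.sum(axis=1, keepdims=True)
    return topk, gates

def update_bias(bias, expert_load, gamma=1e-3):
    # After each training step, nudge under-loaded experts up and over-loaded
    # ones down by a fixed step, replacing the usual auxiliary balance loss.
    return bias + gamma * np.sign(expert_load.mean() - expert_load)

# Toy usage: 8 tokens, 4 experts, route each token to 2 experts, then rebalance.
rng = np.random.default_rng(0)
affinity = rng.uniform(size=(8, 4))
bias = np.zeros(4)
topk, gates = route(affinity, bias, k=2)
expert_load = np.bincount(topk.ravel(), minlength=4) / topk.size
bias = update_bias(bias, expert_load)

The appeal of this approach, per the report, is that balancing no longer competes with the language-modeling loss, since the bias affects expert selection but never the gating weights.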

Long context

The provider has not supplied this information.

Optimizing model performance

The provider has not supplied this information.

Additional assets

Learn more: original model announcement.

Training disclosure

Training, testing and validation

The provider has not supplied this information.

Distribution

Distribution channels

The provider has not supplied this information.

More information

The provider has not supplied this information.

Responsible AI considerations

Safety techniques

Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs more safe and appropriate for users, resulting in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks. We recommend customers use Azure AI Content Safety in conjunction with this model and conduct their own evaluations on production systems. When deployed via Microsoft Foundry, prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.
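
Because the guidance above recommends pairing the model with Azure AI Content Safety, here is a minimal, hypothetical sketch of screening text with the Content Safety SDK (azure-ai-contentsafety). The endpoint and key variables and the severity threshold are placeholders to adapt to your own policy; Foundry's default server-side content filtering described above applies independently of any client-side check like this.

# Hypothetical sketch: screening a prompt (or a model completion) with Azure AI
# Content Safety before passing it along. Endpoint, key, and the threshold of 2
# are placeholders, not values from this model card.
import os
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)

def is_flagged(text: str, max_severity: int = 2) -> bool:
    """Return True if any harm category exceeds the chosen severity threshold."""
    result = client.analyze_text(AnalyzeTextOptions(text=text))
    return any(
        item.severity is not None and item.severity > max_severity
        for item in result.categories_analysis
    )

completion = "...model output from DeepSeek-V3-0324..."
if is_flagged(completion):
    print("Blocked by custom content-safety policy.")
else:
    print(completion)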

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: DeepSeek
Significant improvements in benchmark performance:
  • MMLU-Pro: 75.9 → 81.2 (+5.3)
  • GPQA: 59.1 → 68.4 (+9.3)
  • AIME: 39.6 → 59.4 (+19.8)
  • LiveCodeBench: 39.2 → 49.2 (+10.0)

Benchmarking methodology

The provider has not supplied this information.

Public data summary

Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs more safe and appropriate for users, resulting in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks.
Model Specifications
  • Context length: 128,000 tokens
  • Quality index: 0.78
  • License: MIT
  • Last updated: December 2025
  • Input type: Text
  • Output type: Text
  • Provider: DeepSeek
  • Languages: 2 languages