DeepSeek-V3-0324
Version: 1
DeepSeek · Last updated April 2025
DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including enhanced reasoning, improved function calling, and superior code generation capabilities.
Categories: Coding, Agents
Learn more: [original model announcement]
DeepSeek-V3-0324 is a Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated for each token. It adopts the Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Additionally, DeepSeek-V3-0324 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for enhanced performance.
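The "37 billion activated per token" figure comes from MoE routing: a gating network scores all experts for each token and only the top-k experts actually run. The sketch below is a generic, illustrative top-k MoE forward pass in NumPy (toy dimensions and randomly initialized "experts"), not the model's actual implementation:

```python
import numpy as np

def moe_forward(token_vec, gate_w, experts, k=8):
    """Illustrative top-k MoE routing: score all experts, keep the k best,
    and combine only those experts' outputs with softmax weights."""
    scores = gate_w @ token_vec                      # one gating score per expert
    top = np.argsort(scores)[-k:]                    # indices of the k highest-scoring experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                         # softmax over the selected experts only
    return sum(w * experts[i](token_vec) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
dim, n_experts = 16, 32
gate = rng.normal(size=(n_experts, dim))
# Each "expert" is a tiny linear map; only k of the 32 run for a given token.
expert_mats = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
experts = [lambda x, M=M: M @ x for M in expert_mats]
out = moe_forward(rng.normal(size=dim), gate, experts, k=8)
print(out.shape)  # (16,)
```

This is why total parameter count (all experts) and activated parameter count (only the routed experts) differ so sharply in MoE models.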
Reasoning Capabilities
  • Significant improvements in benchmark performance:
    • MMLU-Pro: 75.9 → 81.2 (+5.3)
    • GPQA: 59.1 → 68.4 (+9.3)
    • AIME: 39.6 → 59.4 (+19.8)
    • LiveCodeBench: 39.2 → 49.2 (+10.0)
  • Front-End Web Development
    • Improved the executability of the code
    • More aesthetically pleasing web pages and game front-ends
  • Chinese Writing Proficiency
    • Enhanced style and content quality:
      • Aligned with the R1 writing style
      • Better quality in medium-to-long-form writing
    • Feature Enhancements
      • Improved multi-turn interactive rewriting
      • Optimized translation quality and letter writing
  • Chinese Search Capabilities
    • Enhanced report analysis requests with more detailed outputs
  • Function Calling Improvements
    • Increased accuracy in Function Calling, fixing issues from previous V3 versions
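Function calling follows the OpenAI-style tools schema, since the DeepSeek API is OpenAI-compatible. The sketch below only builds an illustrative request payload; the `get_weather` tool and the `deepseek-chat` model identifier are assumptions for the example, not values confirmed by this page:

```python
# Illustrative OpenAI-style function-calling request payload.
# The tool name, fields, and model identifier are made up for the example.
def build_tool_call_request(user_message: str) -> dict:
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",           # hypothetical tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
    return {
        "model": "deepseek-chat",            # assumed model identifier
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
        "tool_choice": "auto",               # let the model decide when to call
    }

req = build_tool_call_request("What's the weather in Hangzhou?")
print(req["tools"][0]["function"]["name"])  # get_weather
```

When the model decides to call the tool, the response carries a `tool_calls` entry with the function name and JSON arguments; the accuracy improvements above refer to the model producing these calls correctly.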

Model alignment

Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning it appears to have undergone less of the refinement designed to make its behavior and outputs safe and appropriate for users. This results in (i) a higher risk that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks. We recommend that customers use Azure AI Content Safety in conjunction with this model and conduct their own evaluations on production systems.

Content filtering

When deployed via Azure AI Foundry, prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.
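A content filter of this kind classifies text into harm categories and assigns each a severity score, then blocks content whose severity crosses a configured threshold. The sketch below shows that decision logic in plain Python; the category names mirror Azure AI Content Safety's text categories, but the thresholds and severity values here are assumptions for illustration:

```python
# Hedged sketch of a content-filter decision: block a completion if any
# harm category's severity meets or exceeds its configured threshold.
# Category names follow Azure AI Content Safety's text categories;
# the specific threshold values below are illustrative assumptions.
DEFAULT_THRESHOLDS = {"Hate": 2, "Sexual": 2, "Violence": 2, "SelfHarm": 2}

def should_block(severities: dict, thresholds: dict = DEFAULT_THRESHOLDS) -> bool:
    """Return True if any category's severity reaches its threshold."""
    return any(severities.get(cat, 0) >= limit
               for cat, limit in thresholds.items())

print(should_block({"Hate": 0, "Violence": 4}))  # True: Violence >= threshold
print(should_block({"Hate": 0, "Violence": 0}))  # False: all below threshold
```

Raising or lowering the per-category thresholds is what the configurable content-filtering options referred to above amount to in practice.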
Model Specifications
Context Length: 128,000 tokens
Quality Index: 0.75
License: MIT
Last Updated: April 2025
Input Type: Text
Output Type: Text
Publisher: DeepSeek
Languages: 2 languages
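When building prompts against the 128,000-token context window, it is useful to budget input tokens against the space reserved for the completion. The sketch below uses a crude chars-per-token heuristic (an assumption, not the model's tokenizer) to illustrate the check:

```python
# Rough sketch: checking a prompt against the 128,000-token context window.
# The 4-characters-per-token ratio is a crude heuristic for illustration,
# not the model's actual tokenizer.
CONTEXT_LENGTH = 128_000

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reserved_for_output: int = 4_096) -> bool:
    """True if the estimated prompt tokens plus reserved output tokens fit."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_LENGTH

print(fits_in_context("Hello, DeepSeek!"))  # True
```

For production use, replace the heuristic with the model's real tokenizer to get exact counts.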