DeepSeek-V3-0324
Version: 1
Direct from Azure models
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Key capabilities
About this model
DeepSeek-V3-0324 shows significant improvements over its predecessor, DeepSeek-V3, in several key aspects.
Key model capabilities
Reasoning Capabilities
Significant improvements in benchmark performance:
- MMLU-Pro: 75.9 → 81.2 (+5.3)
- GPQA: 59.1 → 68.4 (+9.3)
- AIME: 39.6 → 59.4 (+19.8)
- LiveCodeBench: 39.2 → 49.2 (+10.0)
Front-End Web Development
- Improved the executability of the code
- More aesthetically pleasing web pages and game front-ends
Chinese Writing Proficiency
- Enhanced style and content quality:
- Aligned with the R1 writing style
- Better quality in medium-to-long-form writing
- Feature Enhancements:
- Improved multi-turn interactive rewriting
- Optimized translation quality and letter writing
Chinese Search Capabilities
- Enhanced report analysis requests with more detailed outputs
Function Calling Improvements
- Increased accuracy in Function Calling, fixing issues from previous V3 versions
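As a hedged illustration of how function calling is typically invoked against an OpenAI-compatible chat completions endpoint in Microsoft Foundry, the minimal sketch below uses the openai Python client; the endpoint URL, API key, deployment name, and the get_weather tool are placeholders for illustration, not values taken from this model card.

```python
# Minimal sketch of function calling against an OpenAI-compatible endpoint.
# Assumptions (not from this model card): the deployment exposes the
# chat completions route, and the endpoint, key, and model name are placeholders.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-resource>.services.ai.azure.com/openai/v1",  # placeholder
    api_key="<your-api-key>",                                            # placeholder
)

# A hypothetical tool the model may choose to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="DeepSeek-V3-0324",  # deployment name is an assumption
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decided to call the tool, the arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```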
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
The provider has not supplied this information.
Out of scope use cases
Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs safer and more appropriate for users. This results in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks. We recommend customers use Azure AI Content Safety in conjunction with this model and conduct their own evaluations on production systems. When deployed via Microsoft Foundry, prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Technical specs
DeepSeek-V3-0324 is a Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated for each token.
Training cut-off date
The provider has not supplied this information.
Training time
The provider has not supplied this information.
Input formats
The provider has not supplied this information.
Output formats
The provider has not supplied this information.
Supported languages
The provider has not supplied this information.
Sample JSON response
The provider has not supplied this information.
Model architecture
It adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Additionally, DeepSeek-V3-0324 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for enhanced performance.
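To make the sparse-activation idea concrete, here is a minimal, generic top-k Mixture-of-Experts routing sketch in NumPy. It only illustrates why a model with many experts runs a small fraction of its parameters per token; it is not DeepSeek's DeepSeekMoE, MLA, or auxiliary-loss-free balancing implementation, and all sizes are toy values.

```python
# Toy sketch of sparse Mixture-of-Experts routing (generic top-k gating, not
# DeepSeek's exact implementation). Only the selected experts run per token,
# which is why far fewer parameters are active than the model contains.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2          # toy sizes, not the real config

# One tiny feed-forward "expert" per slot, plus a router projection.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w                     # affinity of this token to each expert
    top = np.argsort(logits)[-top_k:]         # keep only the k best experts
    gate = np.exp(logits[top]) / np.exp(logits[top]).sum()  # normalized weights
    return sum(g * (x @ experts[i]) for g, i in zip(gate, top))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)               # (16,)
```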
Long context
The provider has not supplied this information.
Optimizing model performance
The provider has not supplied this information.
Additional assets
Learn more: original model announcement
Training disclosure
Training, testing and validation
The provider has not supplied this information.
Distribution
Distribution channels
The provider has not supplied this information.
More information
The provider has not supplied this information.
Responsible AI considerations
Safety techniques
Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs safer and more appropriate for users. This results in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks. We recommend customers use Azure AI Content Safety in conjunction with this model and conduct their own evaluations on production systems. When deployed via Microsoft Foundry, prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.
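For customers following the recommendation above, the sketch below shows one way completions could be screened with the azure-ai-contentsafety Python package before being returned to users. The endpoint, key, severity threshold, and the screen_completion helper are assumptions for illustration, not a prescribed configuration.

```python
# Minimal sketch: screening a model completion with the azure-ai-contentsafety
# package before returning it to users. The endpoint, key, and severity
# threshold below are placeholders/assumptions, not values from this card.
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(
    endpoint="https://<your-content-safety-resource>.cognitiveservices.azure.com",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                                     # placeholder
)

def screen_completion(text: str, max_severity: int = 2) -> bool:
    """Return True if every harm category stays at or below the chosen severity."""
    result = client.analyze_text(AnalyzeTextOptions(text=text))
    # Treat a missing severity as 0; the threshold itself is an assumption.
    return all((item.severity or 0) <= max_severity for item in result.categories_analysis)

completion = "..."  # text produced by DeepSeek-V3-0324
if not screen_completion(completion):
    completion = "The response was withheld by the content filter."
```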
Safety evaluations
The provider has not supplied this information.
Known limitations
The provider has not supplied this information.
Acceptable use
Acceptable use policy
The provider has not supplied this information.
Quality and performance evaluations
Source: DeepSeek
Significant improvements in benchmark performance:
- MMLU-Pro: 75.9 → 81.2 (+5.3)
- GPQA: 59.1 → 68.4 (+9.3)
- AIME: 39.6 → 59.4 (+19.8)
- LiveCodeBench: 39.2 → 49.2 (+10.0)
Benchmarking methodology
Source: DeepSeek
The provider has not supplied this information.
Public data summary
Source: DeepSeek
Microsoft and external researchers have found DeepSeek-V3 to be less aligned than other models, meaning the model appears to have undergone less refinement designed to make its behavior and outputs safer and more appropriate for users. This results in (i) higher risks that the model will produce potentially harmful content and (ii) lower scores on safety and jailbreak benchmarks.
Model Specifications
Context Length: 128,000
Quality Index: 0.78
License: MIT
Last Updated: December 2025
Input Type: Text
Output Type: Text
Provider: DeepSeek
Languages: 2 languages