Azure-Language-Language-detection

Version: 1

Microsoft • Last updated October 2025

Language detection quickly and accurately identifies the language of any text. It supports over 100 languages and dialects and, for a select number of languages, also reports the script according to the ISO 15924 standard.

Description

Azure Language

Language detection quickly and accurately identifies the language of any text. It supports over 100 languages and dialects and, for a select number of languages, also reports the script according to the ISO 15924 standard. It enables developers to build multilingual applications with seamless integration, empowering global communication and personalized user experiences. The Azure Language service helps you understand and process text at scale with advanced capabilities like sentiment analysis, entity recognition, summarization, and translation. It empowers businesses to unlock insights, automate workflows, and deliver personalized, multilingual experiences with enterprise-grade security and reliability.

Long context

Distribution channels

Azure Language

Key capabilities

About this model

The Language Detection model in Azure Language automatically identifies the primary language of any text input with high accuracy and speed. It is designed to handle many text types, including short phrases, mixed-language content, and large-scale workloads, making it a powerful tool for developers building multilingual applications. By enabling seamless language identification, it helps businesses deliver personalized, localized experiences globally.

Key model capabilities

Accurate Language Identification: Detects over 100 languages and dialects with high precision, even for short or informal text.
Real-Time Processing: Optimized for low-latency, deterministic detection, supporting real-time applications like chatbots and content routing.
Scalable API Integration: Easily integrates via REST APIs or SDKs, enabling developers to process millions of requests efficiently.
Multilingual Workflow Enablement: Works seamlessly with other Azure Language features like translation, sentiment analysis, and summarization for end-to-end language solutions.

Use cases

See here for additional considerations for responsible use.

Key use cases

Customer Support Automation: Detect language in incoming tickets or chats to route to the right support team or trigger translation.
Content Personalization: Identify user language preferences for delivering localized content and recommendations.
Data Processing Pipelines: Pre-process multilingual datasets for analytics, compliance, or AI model training.
Global Communication Apps: Power chatbots, messaging platforms, and collaboration tools with automatic language detection for seamless user experiences.
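
As a minimal sketch of the customer support routing use case, the Python snippet below picks a queue from a detectedLanguage object shaped like the one in the sample output later in this document. The queue names, the "triage" fallback, and the 0.8 confidence threshold are hypothetical values chosen for illustration; they are not part of the service.

```python
def route_ticket(detected_language, queues, fallback="triage", min_confidence=0.8):
    """Pick a support queue from a detectedLanguage object.

    `queues`, `fallback`, and `min_confidence` are hypothetical,
    application-side settings, not service parameters.
    """
    code = detected_language.get("iso6391Name")
    score = detected_language.get("confidenceScore", 0.0)
    # Route directly only when the language is served and the score is high enough.
    if score >= min_confidence and code in queues:
        return queues[code]
    # Otherwise hand the ticket to a human-reviewed fallback queue.
    return fallback


# Hypothetical mapping from language code to support queue.
QUEUES = {"en": "support-en", "fr": "support-fr"}

confident = {"confidenceScore": 1.0, "iso6391Name": "fr", "name": "French"}
uncertain = {"confidenceScore": 0.62, "iso6391Name": "en", "name": "English"}

print(route_ticket(confident, QUEUES))
print(route_ticket(uncertain, QUEUES))
```

Sending low-confidence results to a fallback queue rather than guessing keeps a person involved for ambiguous tickets; the right threshold depends on your own data.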

Out of scope use cases

The model is not intended for:

Detecting multiple languages within a single sentence or providing language proficiency scoring.
Identifying dialect nuances beyond supported language codes.
Any use that violates Microsoft's Responsible Use of AI, such as discriminatory profiling or harmful content targeting.

Pricing

Pricing is based on the number of text records processed and the selected tier. See the Azure pricing page for more pricing details.

Technical specs

Language Detection is a cloud-based service that uses advanced machine learning models to automatically identify the primary language of any given text input. It supports over 100 languages and dialects, delivering high accuracy even for short phrases or mixed-language content. The model is optimized for speed and scalability, enabling real-time detection through REST APIs or SDKs, and integrates seamlessly with other Azure AI models for multilingual workflows such as translation, sentiment analysis, and content moderation.

Input formats

The Language Detection model expects UTF-8 encoded text as input. You can interact with the model through Foundry portal, REST API (JSON payload), and SDKs (available for .NET, Python, Java, and JavaScript).

Supported language

The Language Detection feature can detect a wide range of languages, variants, dialects, and some regional/cultural languages, and returns each detected language with its name and code. The returned language codes conform to the BCP 47 standard, and most of them are ISO 639-1 identifiers. See the full list of supported languages linked here.

Supported Azure regions

See the full list of supported Azure regions for Azure Language linked here.

Sample JSON response

Sample input

{
    "documents": [
        {
            "id": "1",
            "text": "communication"
        },
        {
            "id": "2",
            "text": "communication",
            "countryHint": "fr"
        }
    ]
}
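
A request body like the sample input above could be assembled in Python as sketched below. The helper names are hypothetical, and the sketch only builds and UTF-8 encodes the documents list shown here; the endpoint URL, authentication, and any outer envelope required by a particular API version are deliberately left out and should be taken from the service documentation.

```python
import json


def build_detection_payload(texts, country_hints=None):
    """Build a body shaped like the sample input in this document.

    `country_hints` is a hypothetical convenience: a dict mapping the
    1-based position of a text to a two-letter country/region code.
    """
    country_hints = country_hints or {}
    documents = []
    for index, text in enumerate(texts, start=1):
        document = {"id": str(index), "text": text}
        hint = country_hints.get(index)
        if hint is not None:
            document["countryHint"] = hint
        documents.append(document)
    return {"documents": documents}


def encode_payload(payload):
    """Serialize to JSON and encode as UTF-8, the encoding the model expects."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


payload = build_detection_payload(["communication", "communication"], {2: "fr"})
body = encode_payload(payload)
print(body.decode("utf-8"))
```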

Sample output

{
    "documents": [
        {
            "detectedLanguage": {
                "confidenceScore": 0.62,
                "iso6391Name": "en",
                "name": "English"
            },
            "id": "1",
            "warnings": []
        },
        {
            "detectedLanguage": {
                "confidenceScore": 1.0,
                "iso6391Name": "fr",
                "name": "French"
            },
            "id": "2",
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2022-10-01"
}
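
To illustrate how a response like the sample output above might be consumed, the following Python sketch parses it and collects the detected language for each document. The function name summarize_detections is hypothetical; the field names are the ones shown in the sample output.

```python
import json

# The sample output from this document, embedded as a string.
SAMPLE_RESPONSE = """
{
    "documents": [
        {"detectedLanguage": {"confidenceScore": 0.62, "iso6391Name": "en", "name": "English"},
         "id": "1", "warnings": []},
        {"detectedLanguage": {"confidenceScore": 1.0, "iso6391Name": "fr", "name": "French"},
         "id": "2", "warnings": []}
    ],
    "errors": [],
    "modelVersion": "2022-10-01"
}
"""


def summarize_detections(response):
    """Map each document id to (language code, language name, confidence)."""
    results = {}
    for document in response.get("documents", []):
        detected = document["detectedLanguage"]
        results[document["id"]] = (
            detected["iso6391Name"],
            detected["name"],
            detected["confidenceScore"],
        )
    return results


response = json.loads(SAMPLE_RESPONSE)
summary = summarize_detections(response)
for doc_id, (code, name, score) in summary.items():
    print(f"document {doc_id}: {name} ({code}), confidence {score:.2f}")
```

With the sample data, the same word "communication" is reported as English with a confidence of 0.62 when no hint is given and as French with a confidence of 1.0 when countryHint is "fr".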

Model architecture

Transformer-based multilingual encoder architecture optimized for fast inference, scalable text processing, and high-accuracy language identification.

More information

Responsible AI considerations

Safety techniques

N/A

Safety evaluations

N/A

Known limitations

Depending on your scenario and input data, you could experience different levels of performance. The following information is designed to help you understand key concepts about performance as they apply to using Azure Language's language detection.

System limitations and best practices for enhancing performance

For inputs that include mixed-language content, only a single language is returned. In general, the language with the largest representation in the content is returned, but with a lower confidence score.
The service does not yet support the romanized versions of all languages that do not use the Latin script. For example, Pinyin is not supported for Chinese and Franco-Arabic is not supported for Arabic.
Some words exist in multiple languages. For example, "impossible" is common to both English and French. For short samples that include ambiguous words, you may not get the right language.
If you have some idea about the country or region of origin of your text and you encounter mixed languages, you can use the countryHint parameter to pass in a two-letter country/region code.
In general, longer inputs are more likely to be correctly recognized. Full phrases or sentences are more likely to be correctly recognized than single words or sentence fragments.
Not all languages will be recognized. Be sure to check the list of supported languages and scripts.
To distinguish between multiple scripts used to write certain languages such as Kazakh, the language detection feature returns a script name and script code according to the ISO 15924 standard for a limited set of scripts.
The service supports language detection of text only if it is written in the language's native script.
Due to unknown gaps in our training data, certain dialects and language varieties less represented in web data may not be properly recognized.
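
Several of the limitations above (short inputs, ambiguous words, mixed-language content returned with a lower confidence score) can be handled on the application side by flagging results for review. The Python sketch below is one possible approach, not a feature of the service: the function name and both thresholds are hypothetical, and the whitespace-based word count is only a rough proxy that does not suit languages written without spaces.

```python
def review_reasons(text, detected_language, min_confidence=0.8, min_words=3):
    """Return the reasons a detection result may deserve a second look.

    `min_confidence` and `min_words` are hypothetical thresholds that
    should be tuned on your own data.
    """
    reasons = []
    # Single words and fragments are less likely to be recognized correctly.
    if len(text.split()) < min_words:
        reasons.append("short input")
    # Mixed-language or ambiguous text tends to come back with a lower score.
    if detected_language.get("confidenceScore", 0.0) < min_confidence:
        reasons.append("low confidence")
    return reasons


print(review_reasons("communication", {"confidenceScore": 0.62, "iso6391Name": "en"}))
print(review_reasons("This is a full English sentence.", {"confidenceScore": 0.95, "iso6391Name": "en"}))
```

A flagged result could be retried with a countryHint when the likely region is known, or passed to a person for confirmation.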

Acceptable use

Acceptable use policy

Microsoft wants to help you responsibly develop and deploy solutions that use Azure Language. We are taking a principled approach to upholding personal agency and dignity by considering the fairness, reliability & safety, privacy & security, inclusiveness, transparency, and human accountability of our AI systems. These considerations are in line with our commitment to developing Responsible AI. This article discusses Azure Language features and the key considerations for making use of this technology responsibly. Consider the following factors when you decide how to use and implement AI-powered products and features.

General guidelines

When you're getting ready to deploy AI-powered products or features, the following activities help to set you up for success:

Understand what it can do: Fully assess any AI model you are using to understand its capabilities and limitations. Understand how it will perform in your particular scenario and context.
Test with real, diverse data: Understand how your system will perform in your scenario by thoroughly testing it with real life conditions and data that reflects the diversity in your users, geography and deployment contexts. Small datasets, synthetic data and tests that don't reflect your end-to-end scenario are unlikely to sufficiently represent your production performance.
Respect an individual's right to privacy: Only collect data and information from individuals for lawful and justifiable purposes. Only use data and information that you have consent to use for this purpose.
Legal review: Obtain appropriate legal advice to review your solution, particularly if you will use it in sensitive or high-risk applications. Understand what restrictions you might need to work within and your responsibility to resolve any issues that might come up in the future. Do not provide any legal advice or guidance.
System review: If you're planning to integrate an AI-powered product or feature into an existing system of software, customers, or organizational processes, and to use it responsibly, take the time to understand how each part of your system will be affected. Consider how your AI solution aligns with Microsoft's Responsible AI principles.
Human in the loop: Keep a human in the loop, and treat human oversight as a consistent pattern to explore. This means constant human oversight of the AI-powered product or feature and maintaining the role of humans in decision-making. Ensure you can have real-time human intervention in the solution to prevent harm. This enables you to manage situations where the AI model doesn't perform as required.
Security: Ensure your solution is secure and has adequate controls to preserve the integrity of your content and prevent unauthorized access.
Customer feedback loop: Provide a feedback channel that allows users and individuals to report issues with the service once it's been deployed. Once you've deployed an AI-powered product or feature it requires ongoing monitoring and improvement – be ready to implement any feedback and suggestions for improvement.

Terms of Service


Your use of the Azure service is governed by the terms and conditions of the agreement under which you obtained the services.

For customers who purchase or renew a subscription (including free trials) online from Microsoft, your use is governed by either the Microsoft Customer Agreement ("MCA"), or the Microsoft Online Subscription Agreement ("MOSA"). Your use is governed by the latter if the MCA is not available in your geography. Visit the MCA page for availability details.
For customers who purchase through another Microsoft Commercial Licensing Program, such as an Enterprise Agreement, your use is governed by the licensing agreement under which you purchased the services. You can obtain a copy of your licensing agreement by contacting your Microsoft account representative or Commercial Licensing.
If you do not have an Azure subscription, the Microsoft Terms of Use will govern your use of the limited Azure services which can be used without a subscription.

Model Specifications

Last Updated: October 2025

Input Type: Text

Output Type: Text

Provider: Microsoft

Quick Start