Azure-Language-Text-Analytics-for-Health
Version: 1
Azure Language
Azure Language adds advanced natural language processing to your apps using task-optimized AI models. It helps you extract key information from text, transcripts, and files, and detect language to build multilingual, conversational experiences, all with enterprise-grade security and flexible customization.
Key capabilities
About this model
The Text Analytics for Health model in Azure Language extracts and labels relevant medical information from unstructured clinical text. It performs named entity recognition, relation extraction, entity linking, and assertion detection to surface structured insights from doctors' notes, discharge summaries, clinical documents, and electronic health records. It supports both real-time and batch processing, making it suitable for a wide range of healthcare and life sciences workflows.
Key model capabilities
- Named Entity Recognition: Identifies medical entities such as diagnoses, medications, symptoms/signs, body structures, dosages, and Social Determinants of Health (SDOH).
- Relation Extraction: Detects semantic relationships between medical entities (e.g., medication–dosage, diagnosis–body structure, treatment–condition).
- Entity Linking: Maps recognized entities to standardized medical ontologies via the Unified Medical Language System (UMLS) for interoperability.
- Assertion Detection: Identifies negation, uncertainty, conditionality, and association in clinical text to correctly contextualize extracted entities.
- FHIR Support: Returns results in the Fast Healthcare Interoperability Resources (FHIR 4.0.1) format for seamless EHR integration.
- Multilingual Support: Processes clinical text in English and additional preview languages for global healthcare applications.
Use cases
See Responsible Use of AI for additional considerations for responsible use.
Key use cases
- Clinical Document Analysis: Extract structured data from unstructured EHR notes, discharge summaries, and clinical trial documents.
- Healthcare Data Pipelines: Power downstream analytics by converting free-text clinical notes into structured, searchable data.
- Medical Coding Assistance: Surface diagnoses, procedures, and medications from notes to assist with clinical coding workflows.
- Drug Safety & Pharmacovigilance: Identify adverse events, medications, and dosages in patient records and medical literature.
- Population Health Analytics: Aggregate and analyze SDOH and clinical entity data across patient populations.
Out of scope use cases
The model is not intended for:
- Use as a medical device, clinical support, or diagnostic tool, or for the diagnosis, cure, mitigation, treatment, or prevention of disease.
- Replacing professional medical advice, healthcare opinion, or the clinical judgment of a healthcare professional.
- Any use that violates Microsoft's Responsible Use of AI.
Pricing
Pricing is based on the number of text records processed and the selected tier. See the Azure pricing page for more details.
Technical specs
Text Analytics for Health is a cloud-based service using advanced transformer-based NLP models pre-trained on clinical and biomedical text. It supports named entity recognition across a wide range of healthcare entity categories, relation extraction between co-occurring entities, entity linking to UMLS, and assertion detection for negation, uncertainty, and conditionality. The service can be accessed via REST API, Azure SDKs, Microsoft Foundry, or deployed on-premises using Docker containers.
Input formats
The Text Analytics for Health model expects UTF-8 encoded plain text as input. You can interact with the model through the Foundry portal, REST API (JSON payload), SDKs (available for .NET, Python, Java, and JavaScript), or a self-hosted Docker container.
Supported languages
The feature supports English for general availability, with additional languages available in preview. See the full list of supported languages linked here.
Supported Azure regions
See the full list of supported Azure regions for Azure Language linked here.
Sample JSON response
Sample input
{
  "documents": [
    {
      "language": "en",
      "id": "1",
      "text": "Patient is a 45-year-old male diagnosed with Type 2 diabetes mellitus. He is currently taking Metformin 500mg twice daily."
    }
  ]
}
Sample output
{
  "results": {
    "documents": [
      {
        "id": "1",
        "entities": [
          {
            "offset": 13,
            "length": 11,
            "text": "45-year-old",
            "category": "Age",
            "confidenceScore": 0.98
          },
          {
            "offset": 25,
            "length": 4,
            "text": "male",
            "category": "Gender",
            "confidenceScore": 0.99
          },
          {
            "offset": 45,
            "length": 24,
            "text": "Type 2 diabetes mellitus",
            "category": "Diagnosis",
            "confidenceScore": 0.97,
            "links": [
              {
                "dataSource": "UMLS",
                "id": "C0011860"
              }
            ]
          },
          {
            "offset": 94,
            "length": 9,
            "text": "Metformin",
            "category": "MedicationName",
            "confidenceScore": 0.99
          },
          {
            "offset": 104,
            "length": 5,
            "text": "500mg",
            "category": "Dosage",
            "confidenceScore": 0.98
          },
          {
            "offset": 110,
            "length": 11,
            "text": "twice daily",
            "category": "Frequency",
            "confidenceScore": 0.97
          }
        ],
        "relations": [
          {
            "relationType": "DosageOfMedication",
            "entities": [
              { "ref": "#/results/documents/0/entities/3", "role": "Medication" },
              { "ref": "#/results/documents/0/entities/4", "role": "Dosage" }
            ]
          }
        ],
        "warnings": []
      }
    ],
    "errors": [],
    "modelVersion": "2022-08-15"
  }
}
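The relations in the sample output point at entities through "ref" strings rather than repeating them. The following is a minimal Python sketch of how such references might be resolved, assuming the response has the shape shown above. The helper names resolve_ref and describe_relations are hypothetical and not part of any SDK, and the embedded response is a trimmed-down illustration rather than real service output.

```python
import json

# Trimmed-down, illustrative response in the same shape as the sample output.
response = json.loads("""
{
  "results": {
    "documents": [
      {
        "id": "1",
        "entities": [
          {"text": "Metformin", "category": "MedicationName"},
          {"text": "500mg", "category": "Dosage"}
        ],
        "relations": [
          {
            "relationType": "DosageOfMedication",
            "entities": [
              {"ref": "#/results/documents/0/entities/0", "role": "Medication"},
              {"ref": "#/results/documents/0/entities/1", "role": "Dosage"}
            ]
          }
        ]
      }
    ]
  }
}
""")


def resolve_ref(root, ref):
    """Follow a '#/a/b/0' style reference starting at the root of the response."""
    node = root
    for part in ref.lstrip("#/").split("/"):
        # List levels are addressed by index, object levels by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def describe_relations(root):
    """Return (relationType, {role: entity text}) for every relation found."""
    described = []
    for doc in root["results"]["documents"]:
        for relation in doc.get("relations", []):
            roles = {
                member["role"]: resolve_ref(root, member["ref"])["text"]
                for member in relation["entities"]
            }
            described.append((relation["relationType"], roles))
    return described


relations = describe_relations(response)
```

Resolving references against the response root, rather than against a single document, mirrors the fact that each "ref" begins at "#/results".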
Model architecture
Transformer-based multilingual NER architecture pre-trained on biomedical and clinical corpora, fine-tuned for health entity recognition, relation extraction, entity linking to UMLS, and assertion detection in clinical text.
Long context
For synchronous requests, Text Analytics for Health supports up to 5,120 characters per document. For asynchronous requests, it supports up to 125,000 characters per document. Results from asynchronous requests are available for 24 hours after ingestion.
Optimizing model performance
Efficiency
- Batch Processing: Combine multiple documents into a single API call to reduce network overhead and improve throughput.
- Asynchronous API: Use asynchronous requests for large documents or high-volume workloads to maximize throughput.
- Docker Container: For on-premises or air-gapped scenarios, deploy the Docker container to bring the service closer to your data.
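As an illustration of the batch processing point above, here is a minimal Python sketch that groups clinical notes into request bodies shaped like the sample input. The function name build_batches and the default batch_size of 10 are assumptions made for this example. Check the documented service limits for the actual maximum number of documents per request.

```python
def build_batches(texts, batch_size=10, language="en"):
    """Group texts into request bodies shaped like the sample input.

    batch_size is a hypothetical default, not a documented service limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    for start in range(0, len(texts), batch_size):
        documents = [
            # Document ids are 1-based and unique across all batches.
            {"language": language, "id": str(start + i + 1), "text": text}
            for i, text in enumerate(texts[start:start + batch_size])
        ]
        batches.append({"documents": documents})
    return batches


notes = [f"Clinical note {n}" for n in range(1, 6)]
batches = build_batches(notes, batch_size=2)
```

Keeping ids unique across batches makes it easier to match results back to the original notes after several calls.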
Accuracy
- Full Document Context: Submit complete clinical notes rather than fragmented sentences to ensure the model has sufficient context for accurate entity detection and relation extraction.
- FHIR Output: Use the FHIR response format for structured, standards-compliant output when integrating with EHR systems.
- Assertion Detection: Always review assertion detection results (negation, uncertainty) before using extracted entities in downstream clinical workflows.
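One way to act on the assertion detection advice above is to separate affirmed entities from those that need review before they reach downstream systems. The sketch below is written in Python and rests on assumptions: it presumes each entity may carry an "assertion" object with "certainty", "conditionality", and "association" fields, and the specific labels in NON_AFFIRMED are assumed values. Confirm the actual field names and labels against the API reference for the version you use.

```python
# Certainty labels treated as "not affirmed" in this sketch (assumed values).
NON_AFFIRMED = {"negative", "negativePossible", "neutralPossible"}


def split_by_assertion(entities):
    """Split entities into (affirmed, needs_review) lists.

    An entity is sent to review if it is negated or uncertain, carries any
    conditionality, or is associated with someone other than the patient.
    """
    affirmed, needs_review = [], []
    for entity in entities:
        assertion = entity.get("assertion") or {}
        flagged = (
            assertion.get("certainty") in NON_AFFIRMED
            or "conditionality" in assertion
            or assertion.get("association") == "other"
        )
        (needs_review if flagged else affirmed).append(entity)
    return affirmed, needs_review


# Illustrative entities, not real service output.
entities = [
    {"text": "chest pain", "category": "SymptomOrSign",
     "assertion": {"certainty": "negative"}},
    {"text": "hypertension", "category": "Diagnosis"},
    {"text": "breast cancer", "category": "Diagnosis",
     "assertion": {"association": "other"}},
]
affirmed, needs_review = split_by_assertion(entities)
```

Routing flagged entities to a review list instead of discarding them keeps a human reviewer in control of the final decision.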
Cost-Effectiveness
- Selective Analysis: Use the API's feature selection parameters to enable only the capabilities you need (e.g., NER only, without relation extraction or entity linking).
- Chunking Large Documents: For documents exceeding the character limit, chunk them at natural boundaries (e.g., section headers) to maintain context.
- Autoscaling & Rate Limiting: Configure autoscaling for peak loads and apply throttling to avoid unnecessary compute costs.
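The chunking guidance above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It treats blank lines as the natural boundaries, uses the 5,120-character synchronous limit as the default, and counts characters with len(), which may differ from how the service counts for non-ASCII text. The name chunk_text is hypothetical.

```python
import re

SYNC_CHAR_LIMIT = 5_120  # synchronous per-document limit quoted in this document


def chunk_text(text, limit=SYNC_CHAR_LIMIT):
    """Split text into chunks of at most `limit` characters.

    Blocks separated by blank lines are kept together where possible.
    A single block longer than the limit is hard-split as a last resort.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks, current = [], ""
    for block in re.split(r"\n\s*\n", text):
        block = block.strip()
        if not block:
            continue
        candidate = f"{current}\n\n{block}" if current else block
        if len(candidate) <= limit:
            current = candidate
            continue
        if current:
            chunks.append(current)
        while len(block) > limit:
            chunks.append(block[:limit])
            block = block[limit:]
        current = block
    if current:
        chunks.append(current)
    return chunks


note = (
    "HISTORY\nPatient reports cough.\n\n"
    "MEDICATIONS\nMetformin 500mg.\n\n"
    "PLAN\nFollow up in 2 weeks."
)
chunks = chunk_text(note, limit=50)
```

Because relations are only detected within a single document, splitting at section boundaries rather than at a fixed character count reduces the chance of separating related entities such as a medication and its dosage.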
Additional assets
Distribution
More information
Responsible AI considerations
Safety techniques
N/A
Safety evaluations
N/A
Important disclaimer
Text Analytics for Health is provided "AS IS" and "WITH ALL FAULTS." It is not intended or made available for use as a medical device, clinical support, diagnostic tool, or other technology intended to be used in the diagnosis, cure, mitigation, treatment, or prevention of disease or other conditions. No license or right is granted by Microsoft to use this capability for such purposes. This capability is not designed or intended to be a substitute for professional medical advice or healthcare opinion, diagnosis, treatment, or the clinical judgment of a healthcare professional, and should not be used as such. The customer is solely responsible for any use of Text Analytics for Health. Customers must separately license any and all source vocabularies they intend to use under the terms set for the UMLS Metathesaurus License Agreement. The customer is responsible for ensuring compliance with those license terms, including any geographic or other applicable restrictions. All decisions leveraging outputs of Text Analytics for Health that impact individuals or resource allocation (including, but not limited to, those related to billing, human resources, or treatment and managing care) should be made with human oversight and should not be based solely on the findings of the model.
Known limitations
Depending on your scenario, your input data, and the entities you wish to extract, you could experience different levels of performance. The following sections are designed to help you understand key concepts about performance as they apply to using the Azure Language Text Analytics for Health service.
Understand and measure performance
Since both false positive and false negative errors can occur, it is important to understand how both types of errors might affect your overall system. In clinical workflows, false negatives could lead to missed medical entities that are relevant to patient care. False positives could introduce incorrect entities into downstream systems. You can tune your system by adjusting the confidence score threshold it applies to extracted entities. Threshold values may not behave consistently across individual categories of health entities. Therefore, it is critical that you test your system with the real clinical data it will process in production.
System limitations and best practices for enhancing performance
- Make sure you understand all the health entity categories that can be recognized by the system. Your clinical data may include information that is not covered by the categories the service currently supports.
- Context is critical for clinical entity recognition. Submit complete clinical notes or full document sections rather than isolated sentences to maximize accuracy for entity detection, relation extraction, and assertion detection.
- Assertion detection (negation, uncertainty, conditionality) is important in clinical text. Always verify assertion detection outputs before using extracted entities in downstream clinical or compliance workflows.
- The service extracts Social Determinants of Health (SDOH) and ethnicity mentions in text. This capability may not cover all potential SDOH and does not derive inferences based on SDOH or ethnicity (for example, substance use information is surfaced, but substance abuse is not inferred). The SDOH extraction capability is intended to help providers improve health outcomes and should not be used to stigmatize or draw negative inferences about patient populations.
- The service is optimized for English. For other languages currently in preview, performance may vary. Consider verifying the language of your input text before processing.
- Text Analytics for Health processes plain text. If you are extracting text from clinical documents in other formats (e.g., PDF, scanned images), ensure your preprocessing accurately captures the full text without truncation or corruption.
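Because confidence thresholds may behave differently across entity categories, as noted under "Understand and measure performance", one option is to keep a separate threshold per category. The Python sketch below illustrates the idea. The threshold values and the name filter_by_confidence are hypothetical placeholders. Suitable values can only be found by testing against your own clinical data.

```python
# Hypothetical thresholds for illustration only; tune them on your own data.
DEFAULT_THRESHOLD = 0.80
CATEGORY_THRESHOLDS = {"Diagnosis": 0.95, "MedicationName": 0.90}


def filter_by_confidence(entities, thresholds=CATEGORY_THRESHOLDS,
                         default=DEFAULT_THRESHOLD):
    """Split entities into (kept, dropped) using a per-category threshold."""
    kept, dropped = [], []
    for entity in entities:
        threshold = thresholds.get(entity["category"], default)
        target = kept if entity["confidenceScore"] >= threshold else dropped
        target.append(entity)
    return kept, dropped


# Illustrative entities, not real service output.
entities = [
    {"text": "Type 2 diabetes mellitus", "category": "Diagnosis", "confidenceScore": 0.97},
    {"text": "asthma", "category": "Diagnosis", "confidenceScore": 0.91},
    {"text": "500mg", "category": "Dosage", "confidenceScore": 0.85},
    {"text": "Metformin", "category": "MedicationName", "confidenceScore": 0.89},
]
kept, dropped = filter_by_confidence(entities)
```

Raising a threshold trades false positives for false negatives, so the dropped list is worth logging and reviewing rather than discarding silently.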
Acceptable use
Acceptable use policy
Microsoft wants to help you responsibly develop and deploy solutions that use Azure Language. We are taking a principled approach to upholding personal agency and dignity by considering the fairness, reliability & safety, privacy & security, inclusiveness, transparency, and human accountability of our AI systems. These considerations are in line with our commitment to developing Responsible AI. This article discusses Azure Language features and the key considerations for making use of this technology responsibly. Consider the following factors when you decide how to use and implement AI-powered products and features.
General guidelines
When you're getting ready to deploy AI-powered products or features, the following activities help to set you up for success:
- Understand what it can do: Fully assess the capabilities of any AI model you are using to understand its capabilities and limitations. Understand how it will perform in your particular scenario and context.
- Test with real, diverse data: Understand how your system will perform in your scenario by thoroughly testing it with real life conditions and data that reflects the diversity in your users, geography and deployment contexts. Small datasets, synthetic data and tests that don't reflect your end-to-end scenario are unlikely to sufficiently represent your production performance.
- Respect an individual's right to privacy: Only collect data and information from individuals for lawful and justifiable purposes. Only use data and information that you have consent to use for this purpose. Handle all clinical and health data in accordance with applicable laws and regulations (e.g., HIPAA, GDPR).
- Legal review: Obtain appropriate legal advice to review your solution, particularly if you will use it in sensitive or high-risk applications. Understand what restrictions you might need to work within and your responsibility to resolve any issues that might come up in the future. Do not provide any legal advice or guidance.
- System review: If you plan to integrate an AI-powered product or feature into an existing system of software, customers, or organizational processes and use it responsibly, take the time to understand how each part of your system will be affected. Consider how your AI solution aligns with Microsoft's Responsible AI principles.
- Human in the loop: Keep a human in the loop, and treat human oversight as a consistent pattern to explore. This means constant human oversight of the AI-powered product or feature and maintaining the role of humans in decision-making. Ensure you can have real-time human intervention in the solution to prevent harm. This is especially critical for clinical applications where AI outputs should always be reviewed by qualified healthcare professionals.
- Security: Ensure your solution is secure and has adequate controls to preserve the integrity of your content and prevent unauthorized access. Clinical and health data requires particularly robust security controls.
- Customer feedback loop: Provide a feedback channel that allows users and individuals to report issues with the service once it's been deployed. Once you've deployed an AI-powered product or feature it requires ongoing monitoring and improvement – be ready to implement any feedback and suggestions for improvement.
Terms of Service
Your use of the Azure service is governed by the terms and conditions of the agreement under which you obtained the services.
- For customers who purchase or renew a subscription (including free trials) online from Microsoft, your use is governed by either the Microsoft Customer Agreement ("MCA"), or the Microsoft Online Subscription Agreement ("MOSA"). Your use is governed by the latter if the MCA is not available in your geography. Visit the MCA page for availability details.
- For customers who purchase through another Microsoft Commercial Licensing Program, such as an Enterprise Agreement, your use is governed by the licensing agreement under which you purchased the services. You can obtain a copy of your licensing agreement by contacting your Microsoft account representative or Commercial Licensing.
- If you do not have an Azure subscription, the Microsoft Terms of Use will govern your use of the limited Azure services which can be used without a subscription.
Model Specifications
Last Updated: May 2026
Input Type: Text
Output Type: Text
Provider: Microsoft