Azure-Language-Document-PII-redaction

PII Redaction for Documents automatically detects and masks sensitive information such as names, addresses, phone numbers, credit card details, and other personally identifiable information (PII) in native documents including PDF, Word, and text files.

Microsoft

Version: 1

Azure Language

Azure Language adds advanced natural language processing to your apps using task‑optimized AI models. It helps you extract key information from text, transcripts, and files as well as detect language to build multilingual, conversational experiences—all with enterprise‑grade security and flexible customization.

About this model

The Document PII Redaction model in Azure Language automatically detects and masks sensitive information in native documents (PDF, Word, and plain text files) to help meet privacy and compliance requirements. It is designed for batch processing, which suits enterprise workflows that must handle personal data in documents securely without a separate text-extraction step.

Key model capabilities

Native Document Support: Processes PDF (.pdf), Microsoft Word (.docx), and plain text (.txt) files directly, eliminating the need for text preprocessing.
Comprehensive PII Detection: Identifies a wide range of sensitive entities like names, addresses, IDs, and financial data.
Automatic Redaction: Replaces detected PII with placeholders or entity type masks to prevent exposure in downstream systems.
Multilingual Support: Detects PII across multiple languages for global applications.
Seamless Integration: Works with REST APIs, SDKs, and Azure AI Foundry Tools for easy deployment and scaling.
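
To make the "placeholders or entity type masks" wording concrete, here is a minimal local sketch of the two masking styles applied to plain text. It is illustrative only: the `Entity` class, the `redact` function, and the style names `entity_mask` and `character_mask` are hypothetical and are not the service's API. The entity offsets are hard-coded here, whereas the service detects them itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A detected PII span (hypothetical shape, for illustration only)."""
    category: str  # e.g. "Person"
    offset: int    # start index of the span in the text
    length: int    # number of characters in the span


def redact(text, entities, style="entity_mask", mask_char="*"):
    """Return text with each entity span replaced according to style."""
    if style not in ("entity_mask", "character_mask"):
        raise ValueError(f"unknown style: {style}")
    out = text
    # Replace from the end of the text so earlier offsets stay valid.
    for e in sorted(entities, key=lambda e: e.offset, reverse=True):
        if style == "entity_mask":
            replacement = f"[{e.category.upper()}]"
        else:
            replacement = mask_char * e.length
        out = out[:e.offset] + replacement + out[e.offset + e.length:]
    return out


text = "Call Ana Silva at 555-0100."
entities = [Entity("Person", 5, 9), Entity("PhoneNumber", 18, 8)]

print(redact(text, entities, style="entity_mask"))
print(redact(text, entities, style="character_mask"))
```

An entity type mask keeps the kind of information visible to downstream readers, while a character mask preserves only the original length.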

Quick facts

Model provider: Microsoft

Type: Document PII extraction

Lifecycle: Generally available (GA)

Input type: text

Output type: text

Pricing: View pricing
