Llama-Guard-3-1B
Key capabilities
About this model
Llama Guard 3-1B is a fine-tuned Llama-3.2-1B pretrained model for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.Key model capabilities
The model is trained to predict safety labels on the 13 categories shown below, based on the MLCommons taxonomy of 13 hazards.| Hazard categories | |
|---|---|
| S1: Violent Crimes | S2: Non-Violent Crimes |
| S3: Sex-Related Crimes | S4: Child Sexual Exploitation |
| S5: Defamation | S6: Specialized Advice |
| S7: Privacy | S8: Intellectual Property |
| S9: Indiscriminate Weapons | S10: Hate |
| S11: Suicide & Self-Harm | S12: Sexual Content |
| S13: Elections | |
See Responsible AI for additional considerations for responsible use.
Key use cases
Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.Out of scope use cases
There are some limitations associated with Llama Guard 3-1B. First, Llama Guard 3-1B itself is an LLM fine-tuned on Llama 3.2. Thus, its performance (e.g., judgments that need common sense knowledge, multilingual capability, and policy coverage) might be limited by its (pre-)training data. Llama Guard performance varies across model size and languages. When possible, developers should consider Llama Guard 3-8B which may provide better safety classification performance but comes at a higher deployment cost. Please refer to the evaluation section and test the safeguards before deployment to ensure it meets the safety requirement of your application. Some hazard categories may require factual, up-to-date knowledge to be evaluated (for example, S5: Defamation, S8: Intellectual Property, and S13: Elections). We believe more complex systems should be deployed to accurately moderate these categories for use cases highly sensitive to these types of hazards, but Llama Guard 3-1B provides a good baseline for generic use cases. Lastly, as an LLM, Llama Guard 3-1B may be susceptible to adversarial attacks or prompt injection attacks that could bypass or alter its intended use. Please report vulnerabilities and we will look to incorporate improvements in future versions of Llama Guard.Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.