Overview
Meta AI, the research arm of Meta Platforms, created the Llama family of open-weight large language models, first launched in February 2023. Today the lineup ranges from lightweight 7 B models to the 405 B-parameter Llama 3.1, with the natively multimodal Llama 4 Scout and Maverick unveiled in April 2025, stretching context windows to 256 K tokens and rivaling proprietary front-runners on reasoning benchmarks. An open, source-available license has catalyzed a 10× surge in cloud usage through 2024, making Llama the most widely deployed open LLM for edge and enterprise workloads alike.

Key Meta Models (July 2025)
- Llama 3.1‑Instruct‑405B – The largest openly available model for long‑form generation.
- Llama 4 Scout 17B 16E Instruct – Vision-text reasoning with a 256 K-token context window for agentic apps.
- Llama 4 Maverick 17B 128E Instruct – Tuned for high-speed chat and multilingual support.
Why Meta on Azure
Run open weights under your own subscription, integrate with Azure GPU fleets, and fine-tune with proprietary data while staying inside your compliance boundary.

Llama 4 Maverick 17B 128E Instruct FP8 is great at precise image understanding and creative writing, offering high quality at a lower price compared to Llama 3.3 70B.
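As a minimal sketch of what calling a Llama deployment under your own Azure subscription looks like, the request body for an OpenAI-style chat-completions route can be assembled as plain JSON. The model name below is taken from this page's catalog; the helper and its defaults are illustrative, not an official SDK.

```python
# Build a chat-completions payload for a Llama deployment.
# Helper name and defaults are illustrative; the payload shape follows
# the common OpenAI-style chat-completions schema.

def build_chat_payload(model, user_message, max_tokens=256, temperature=0.7):
    """Assemble the JSON body for a chat-completions call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = build_chat_payload("Llama-3.3-70B-Instruct",
                             "Summarize this release note in one line.")
```

The payload would then be POSTed with any HTTP client to your deployment's chat-completions endpoint, authenticated with your subscription's key or an Entra token; consult the Azure AI model inference documentation for the exact URL shape.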
Llama 3.3 70B Instruct offers enhanced reasoning, math, and instruction following with performance comparable to Llama 3.1 405B.
Llama 4 Scout 17B 16E Instruct is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.
Llama 4 Scout 17B 16E is great at multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.
Excels in image reasoning capabilities on high-res images for visual understanding apps.
Advanced image reasoning capabilities for visual understanding agentic apps.
The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
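When serving an instruction-tuned Llama 3.1 model without a chat-template helper, the published Llama 3 prompt format (header and end-of-turn tokens from Meta's model cards) can be reproduced directly. A minimal sketch:

```python
# Render a message list into the Llama 3 family's prompt format,
# using the header/end-of-turn tokens documented in Meta's model cards.

def render_llama3_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = render_llama3_prompt([
    {"role": "system", "content": "Answer briefly."},
    {"role": "user", "content": "What is Llama 3.1?"},
])
```

In practice, `tokenizer.apply_chat_template` from Hugging Face `transformers` produces this string for you; hand-rolling it is mainly useful for custom serving stacks.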
Code Llama and its variants are intended for commercial and research use in English and relevant programming languages. The base model Code Llama can be adapted for a variety of code synthesis and understanding tasks; Code Llama Python is designed specifically for Python.
Llama 3.2 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat and agentic applications like knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting.
The Segment Anything Model (SAM) produces high-quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image. It has been trained on a dataset of 11 million images and 1.1 billion masks.
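When SAM is asked for multiple candidate masks per prompt, it returns the masks alongside predicted quality scores, and downstream code typically keeps the best-scoring one. A small NumPy sketch of that selection step; the array shapes follow the common (N, H, W) masks / (N,) scores convention and are an assumption here:

```python
import numpy as np

def pick_best_mask(masks, scores):
    """Select the candidate mask with the highest predicted quality score.

    masks  : (N, H, W) boolean array of candidate masks
    scores : (N,) array of per-mask quality scores
    """
    best = int(np.argmax(scores))
    return masks[best], float(scores[best])

# Three toy 2x2 candidate masks with their scores.
masks = np.array([[[1, 0], [0, 0]],
                  [[1, 1], [0, 0]],
                  [[1, 1], [1, 1]]], dtype=bool)
scores = np.array([0.5, 0.9, 0.7])
best_mask, best_score = pick_best_mask(masks, scores)
```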
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8 B and 70 B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks.
Prompt Guard is a classifier model trained on a large corpus of attacks, capable of detecting both explicitly malicious prompts and data that contains injected inputs. The model is useful as a starting point for identifying and guardrailing against such attacks.
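Because Prompt Guard is exposed as an ordinary text classifier, the guardrailing step reduces to thresholding its scores. A minimal sketch of that gating logic; the input mimics the list-of-dicts shape a Hugging Face text-classification pipeline returns, and the label names and threshold are illustrative assumptions, not the model's official output schema:

```python
# Gate an incoming prompt on classifier output.
# `scores` mimics a Hugging Face text-classification pipeline result;
# label names and the 0.8 threshold are illustrative assumptions.

def should_block(scores, threshold=0.8):
    """Block when any non-benign label meets or exceeds the threshold."""
    return any(s["label"] != "BENIGN" and s["score"] >= threshold
               for s in scores)

verdict = should_block([
    {"label": "BENIGN", "score": 0.05},
    {"label": "INJECTION", "score": 0.92},
])
```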
Llama Guard 3-1B is a fine-tuned Llama-3.2-1B pretrained model for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM, generating text that indicates whether a given prompt or response is safe or unsafe.
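Since Llama Guard reports its verdict as generated text, "safe", or "unsafe" followed by violated category codes, applications need a small parser on top. A sketch of that convention; the exact category codes in the example are illustrative:

```python
def parse_guard_verdict(text):
    """Parse Llama Guard-style output into (is_safe, categories)."""
    lines = [l.strip() for l in text.strip().splitlines() if l.strip()]
    if not lines or lines[0].lower() == "safe":
        return True, []
    # Remaining lines list comma-separated category codes such as "S1,S10".
    cats = []
    for line in lines[1:]:
        cats.extend(c.strip() for c in line.split(",") if c.strip())
    return False, cats

safe, cats = parse_guard_verdict("unsafe\nS1,S10")
```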
The CodeLlama-70b-Instruct model is designed for general code synthesis and understanding. See Responsible AI guidance for additional considerations.
Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
DeiT (Data-efficient image Transformers) is an image transformer that does not require very large amounts of data for training. This is achieved through a novel distillation procedure using a teacher-student strategy, which results in high throughput and accuracy. DeiT is pretrained and fine-tuned on ImageNet.
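The teacher-student distillation behind DeiT boils down to pushing the student's output distribution toward the teacher's. A toy NumPy version of the soft-distillation term, assuming plain KL divergence between temperature-softened logits (DeiT's full recipe also includes a hard-label distillation token, omitted here):

```python
import numpy as np

def softmax(z, temp=1.0):
    """Numerically stable softmax with temperature."""
    z = np.asarray(z, dtype=float) / temp
    e = np.exp(z - z.max())
    return e / e.sum()

def soft_distillation_loss(student_logits, teacher_logits, temp=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temp)   # teacher distribution
    q = softmax(student_logits, temp)   # student distribution
    return float(np.sum(p * np.log(p / q)))

# Identical logits give zero loss; diverging logits give a positive one.
zero = soft_distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
pos = soft_distillation_loss([0.0, 1.0, 2.0], [2.0, 1.0, 0.0])
```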
This is a static model trained on an offline dataset. Future versions of Code Llama Instruct will be released as we improve model safety with community feedback. The base model Code Llama can be adapted for a variety of code synthesis and understanding tasks.
A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
Code Llama comes in three model sizes and three variants: Code Llama, base models designed for general code synthesis and understanding; Code Llama Python, designed specifically for Python; and Code Llama Instruct, for instruction following and safer deployment.
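Beyond left-to-right completion, the 7B and 13B Code Llama models support fill-in-the-middle: the caller supplies the code before and after a gap, and the model generates the middle. A sketch of assembling such a prompt in prefix-suffix-middle order; the sentinel spellings below are the common plain-text rendering of the infill tokens and should be checked against your tokenizer's special tokens:

```python
# Assemble a fill-in-the-middle prompt in prefix-suffix-middle (PSM) order.
# Sentinel spellings are an assumption; verify them against your tokenizer.

def build_infill_prompt(prefix, suffix):
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

prompt = build_infill_prompt(
    "def add(a, b):\n    return ",
    "\n\nprint(add(2, 3))",
)
```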
Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM, generating text that indicates whether a given prompt or response is safe or unsafe.
The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Instruction-tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 34 billion parameters. This model is designed for general code synthesis and understanding.
A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
Vision Transformer (base-sized model) trained using DINOv2. The Vision Transformer (ViT) model was trained using the DINOv2 method, introduced in the paper <a href="https://arxiv.org/abs/2304.07193">DINOv2: Learning Robust Visual Features without Supervision</a> by Oquab et al.
Model developer: Meta. Model release date: July 23, 2024. Status: this is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.
Llama Guard 3-11B-Vision: built with Llama. Llama Guard 3 Vision is a Llama-3.2-11B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to safeguard content for both LLM inputs (prompt classification) and LLM responses (response classification).
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The CodeLlama-70b model is designed for general code synthesis and understanding. Ethical considerations and limitations: Code Llama and its variants are a new technology that carries risks with use.
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The CodeLlama-70b-Python model is designed for general code synthesis and understanding. Limitations and biases: Code Llama and its variants are a new technology that carries risks with use.
This model is designed for general code synthesis and understanding. It is a static model trained on an offline dataset. Future versions of Code Llama Instruct will be released as we improve model safety with community feedback.
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fashion with the DINOv2 method. Images are presented to the model as a sequence of fixed-size patches, which are linearly embedded. One also adds a [CLS] token to the beginning of the sequence, to use it for classification tasks.
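The patch arithmetic implied above is easy to make concrete: an H×W image cut into P×P patches yields (H/P)·(W/P) tokens, plus one for [CLS]. A quick check, assuming DINOv2's 14×14 patch size:

```python
def vit_sequence_length(height, width, patch=14, cls_token=True):
    """Number of tokens a ViT sees for an image of the given size."""
    assert height % patch == 0 and width % patch == 0, "image must tile evenly"
    n_patches = (height // patch) * (width // patch)
    return n_patches + (1 if cls_token else 0)

# DINOv2 uses 14x14 patches; a 224x224 crop gives 16*16 = 256 patches + [CLS].
seq_len = vit_sequence_length(224, 224, patch=14)
```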