AI Model Catalog | Microsoft Foundry Models

facebook-sam3

Version: 8

Hugging Face•Last updated January 2026

Gated Model Access Required

facebook/sam3 requires access approval from the model authors through Hugging Face. To use this model, you must:

- Request access through the model page on Hugging Face and wait for approval from the model authors.
- Create a Custom keys workspace connection in Microsoft Foundry or Azure Machine Learning named HuggingFaceTokenConnection, with the key HF_TOKEN and, as its value, your Hugging Face read or fine-grained token (marked as secret).
- Create the Managed Online Endpoint with the property enforce_access_to_default_secret_stores set to enabled so that it can read the secret connection value.

Once access is approved, the connection is configured, and the endpoint is created with read access to the token, you can deploy and use the model in Microsoft Foundry or Azure Machine Learning.

facebook/sam3 powered by Hugging Face API

Mask Generation

Send Request

You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.

curl <AZUREML_ENDPOINT_URL> \
    -X POST \
    -H "Authorization: Bearer <AZUREML_TOKEN>" \
    -H "Content-Type: application/json" \
    -d '{"inputs":"http://images.cocodataset.org/val2017/000000077595.jpg","parameters":{"mask_threshold":0.5}}'
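The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library; the helper names build_mask_generation_request and send are illustrative and not part of any SDK, and the endpoint URL and token are the same placeholders used in the cURL command.

```python
import json
import urllib.request


def build_mask_generation_request(endpoint_url, token, image_url, **parameters):
    """Build (but do not send) the POST request from the cURL example."""
    payload = {"inputs": image_url}
    if parameters:
        # Optional settings such as mask_threshold go under "parameters".
        payload["parameters"] = parameters
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_mask_generation_request(url, token, image_url, mask_threshold=0.5)) with a real endpoint URL and token would then return the parsed JSON response.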

Supported Parameters

inputs (string or bytes): The input image, which can be a URL, a local file path, or raw bytes representing an image.
parameters (object, optional):
- mask_threshold (float, optional): Threshold applied to the predicted masks to convert them into binary values. Defaults to 0.0.
- pred_iou_thresh (float, optional): Filtering threshold in [0, 1] applied to the predicted mask IoU/quality scores. Defaults to 0.88.
- stability_score_thresh (float, optional): Filtering threshold in [0, 1] using the stability of the mask under variations of the binarization cutoff. Defaults to 0.95.
- stability_score_offset (integer, optional): Amount by which the cutoff is shifted when computing the stability score. Defaults to 1.
- crops_nms_thresh (float, optional): Box IoU cutoff used by non-maximum suppression to remove duplicate masks across crops. Defaults to 0.7.
- crops_n_layers (integer, optional): Number of crop layers; if greater than 0, the image is recursively cropped into 2^layer regions per layer for additional mask prediction passes. Defaults to 0.
- crop_overlap_ratio (float, optional): Fraction of the image side length by which crops overlap in the first crop layer, scaled accordingly for deeper layers. Defaults to 512 / 1500, i.e., approximately 0.3413.
- crop_n_points_downscale_factor (integer, optional): Factor controlling how the number of points-per-side decreases with each crop layer, using factor^n at layer n. Defaults to 1.
- timeout (float, optional): Maximum time in seconds to wait when fetching remote images; if null, no timeout is enforced and the call may block indefinitely.
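
Because a misspelled parameter name or an out-of-range threshold is easy to miss, it can help to check the parameters object on the client before sending it. The sketch below is a hypothetical helper, not something the endpoint provides: it only enforces what the list above states, namely the parameter names and the [0, 1] range given for pred_iou_thresh and stability_score_thresh.

```python
# Parameter names taken from the list above.
KNOWN_PARAMETERS = {
    "mask_threshold",
    "pred_iou_thresh",
    "stability_score_thresh",
    "stability_score_offset",
    "crops_nms_thresh",
    "crops_n_layers",
    "crop_overlap_ratio",
    "crop_n_points_downscale_factor",
    "timeout",
}

# Parameters documented as filtering thresholds in [0, 1].
UNIT_INTERVAL_PARAMETERS = ("pred_iou_thresh", "stability_score_thresh")


def check_mask_generation_parameters(parameters):
    """Return a copy of `parameters`, raising ValueError on obvious mistakes."""
    unknown = sorted(set(parameters) - KNOWN_PARAMETERS)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {', '.join(unknown)}")
    for name in UNIT_INTERVAL_PARAMETERS:
        value = parameters.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return dict(parameters)
```

Any parameter omitted from the object is left for the service to fill in with its default.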

Expected output

The mask-generation Hugging Face API returns a JSON object with the key results, which contains a list of dicts, one for each generated mask, with the following keys:

mask (str): The generated mask for the original image, encoded in base64. A blank mask means that no objects were found.
score (float, optional): Confidence of the detected object described by the mask, returned only when the model is capable of estimating it.

Once the masks are generated, they can be overlaid on the original image. Note that if results is empty, no objects were identified in the image.
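
As an illustration of handling this response, the sketch below decodes each base64 mask into raw bytes and pairs it with its score. The function name decode_masks is hypothetical, and the assumption that the decoded bytes form an image file (in a format the source does not specify) is ours; opening the bytes with an imaging library is left out.

```python
import base64
import json


def decode_masks(response_body):
    """Return a list of (mask_bytes, score) tuples from a response.

    `response_body` may be the raw JSON text/bytes or an already parsed dict.
    `score` is None when the response does not include one.
    """
    if isinstance(response_body, (str, bytes)):
        body = json.loads(response_body)
    else:
        body = response_body
    decoded = []
    # An empty "results" list means no objects were identified.
    for item in body.get("results", []):
        mask_bytes = base64.b64decode(item["mask"])
        decoded.append((mask_bytes, item.get("score")))
    return decoded
```

Each mask_bytes value can then be written to disk or passed to an image library to overlay the mask on the original image.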

Promptable Concept Segmentation (PCS)

Besides mask generation, facebook/sam3 can also perform Promptable Concept Segmentation (PCS) under the /promptable-concept-segmentation route. PCS is a prompt-based segmentation task: a text prompt is provided along with the image, and the model looks for that concept in the image and segments every match. Note that /promptable-concept-segmentation is a custom route that is only available for certain models, such as facebook/sam3.

Send Request

You can use cURL or any REST Client to send a request to the Azure ML endpoint with your Azure ML token.

curl <AZUREML_ENDPOINT_URL>/promptable-concept-segmentation \
    -X POST \
    -H "Authorization: Bearer <AZUREML_TOKEN>" \
    -H "Content-Type: application/json" \
    -d '{"inputs":{"image":"http://images.cocodataset.org/val2017/000000077595.jpg","text":"cat"},"parameters":{"mask_threshold":0.5}}'
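
For Python clients, the URL and JSON body of this call can be assembled as in the minimal sketch below. The helper name build_pcs_call is illustrative only; it appends the custom route to the endpoint URL and includes the optional threshold and mask_threshold parameters only when they are set.

```python
import json

PCS_ROUTE = "/promptable-concept-segmentation"


def build_pcs_call(endpoint_url, image_url, text, threshold=None, mask_threshold=None):
    """Return (url, json_body) for a Promptable Concept Segmentation request."""
    # Avoid a double slash if the endpoint URL already ends with "/".
    url = endpoint_url.rstrip("/") + PCS_ROUTE
    payload = {"inputs": {"image": image_url, "text": text}}
    optional = {"threshold": threshold, "mask_threshold": mask_threshold}
    parameters = {name: value for name, value in optional.items() if value is not None}
    if parameters:
        payload["parameters"] = parameters
    return url, json.dumps(payload)
```

The returned body is sent as a POST request with the same Authorization and Content-Type headers as in the cURL command.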

Supported Parameters

inputs (object):
- image (string or bytes): The input image, which can be a URL, a local file path, or raw bytes representing an image.
- text (string): The concept to find and segment within the provided image.
parameters (object, optional):
- threshold (float, optional): Score threshold to keep instance predictions. Defaults to 0.3.
- mask_threshold (float, optional): Threshold for binarizing the predicted masks. Defaults to 0.5.

Expected output

The promptable-concept-segmentation Hugging Face API returns a JSON object with the key results, which contains a list of dicts, one for each mask generated for the given concept, with the following keys:

mask (str): The generated mask for the original image, encoded in base64. A blank mask means that no objects were found.
score (float, optional): Confidence of the detected object described by the mask, returned only when the model is capable of estimating it.
box ([float, float, float, float], optional): Bounding box in absolute pixel coordinates (xyxy format), returned only when the model is capable of generating it.

Once the masks and boxes are generated, they can be overlaid on the original image. Note that if results is empty, no objects were identified in the image.
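
To show how the xyxy boxes can be consumed, the sketch below filters a results list by score on the client and derives each box's width and height. The names box_size and summarize_detections, and the min_score cutoff, are hypothetical additions for illustration; the server-side threshold parameter already filters predictions before they are returned.

```python
def box_size(box):
    """Return (width, height) in pixels for a box in xyxy format."""
    x_min, y_min, x_max, y_max = box
    return x_max - x_min, y_max - y_min


def summarize_detections(results, min_score=0.0):
    """Keep results scoring at least `min_score` and add box sizes.

    Results without a score are kept; results without a box get size None.
    """
    rows = []
    for item in results:
        score = item.get("score")
        if score is not None and score < min_score:
            continue
        box = item.get("box")
        size = box_size(box) if box is not None else None
        rows.append({"score": score, "box": box, "size": size})
    return rows
```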

Model Specifications

License: Other

Last Updated: January 2026

Provider: Hugging Face
