AI Model Catalog | Microsoft Foundry Models

Description

Microsoft TRELLIS 3D is an asset generation model capable of producing detailed meshes directly from text prompts or images. With multiple size variants, TRELLIS offers options for users aiming to maximize quality and/or speed. This model is ready for non-commercial/commercial use.

Microsoft TRELLIS 3D is available as an NVIDIA NIM™ microservice, part of NVIDIA AI Enterprise. NVIDIA NIM offers prebuilt containers for large language models (LLMs) that can be used to develop chatbots, content analyzers, or any application that needs to understand and generate human language. Each NIM consists of a container and a model and uses a CUDA-accelerated runtime for all NVIDIA GPUs, with special optimizations available for many configurations. NVIDIA NIM is the fastest way to achieve accelerated generative AI inference at scale and has been benchmarked to have up to 2.6x improved throughput latency.

NVIDIA AI Enterprise

NVIDIA AI Enterprise is an end-to-end, cloud-native software platform that accelerates data science pipelines and streamlines development and deployment of production-grade co-pilots and other generative AI applications. Easy-to-use microservices provide optimized model performance with enterprise-grade security, support, and stability to ensure a smooth transition from prototype to production for enterprises that run their businesses on AI.

Third-Party Community Consideration:

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to:

Deployment Geography:

Global

Use Case:

Creators and professionals can use this model to generate high-quality 3D assets from text prompts or images, simplifying visual communication.

Release Date:

build.nvidia.com: September 2, 2025, via https://build.nvidia.com/microsoft/trellis
Hugging Face: December 2, 2024, via https://huggingface.co/microsoft/TRELLIS-image-large

References

TRELLIS project page

Model Architecture:

Architecture Type: Transformer
Network Architecture: Sparse Flow Transformer
Number of model parameters: TRELLIS-image-large: 1.2B; TRELLIS-text-base: 342M; TRELLIS-text-large: 1.1B

Input:

Input Type: Text, Image
Input Parameters: Text: One-Dimensional (1D); Image: Two-Dimensional (2D)
Input Format: Text: String. Image: Red, Green, Blue (RGB)
Other Properties Related to Input: Steps, Classifier-Free Guidance Scale and Seed
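
The inputs listed above (a text string or an RGB image, plus steps, classifier-free guidance scale, and seed) can be gathered into a single request object. The following is a minimal sketch only: the class name, field names, default values, base64 image encoding, and the rule that exactly one of text or image is supplied are all assumptions made for illustration, not the documented request schema of the NIM.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TrellisRequest:
    """Hypothetical request shape; every field name here is illustrative."""
    prompt: Optional[str] = None         # text input (1D string)
    image_rgb_b64: Optional[str] = None  # RGB image input, base64 text (assumed encoding)
    steps: int = 25                      # sampling steps (assumed default)
    cfg_scale: float = 7.5               # classifier-free guidance scale (assumed default)
    seed: int = 0                        # fixed seed for repeatable sampling

    def validate(self) -> None:
        # Assumption: a request carries either a prompt or an image, not both.
        if (self.prompt is None) == (self.image_rgb_b64 is None):
            raise ValueError("provide exactly one of prompt or image_rgb_b64")
        if self.steps < 1:
            raise ValueError("steps must be a positive integer")
        if self.cfg_scale < 0:
            raise ValueError("cfg_scale must be non-negative")

    def to_json(self) -> str:
        self.validate()
        # Drop unset inputs so the payload only carries what was provided.
        return json.dumps({k: v for k, v in asdict(self).items() if v is not None})
```

Validating before serializing keeps malformed parameter combinations from reaching the service; the actual field names and limits should be taken from the NIM's own API reference.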

Output:

Output Type: 3D Object
Output Parameters: Three-Dimensional (3D)
Output Format: GLB (binary glTF, GL Transmission Format)
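
Because the output is a GLB file, a quick sanity check on a generated asset is to read its fixed 12-byte header: a magic value spelling "glTF", a container version, and the total file length, all little-endian 32-bit unsigned integers. The helper below is a minimal sketch of that check; the function name is hypothetical and it inspects only the header, not the JSON or binary chunks that follow.

```python
import struct
from typing import Tuple

GLB_MAGIC = 0x46546C67   # the ASCII bytes "glTF" read as a little-endian uint32
GLB_HEADER_SIZE = 12     # magic + version + length, 4 bytes each


def read_glb_header(data: bytes) -> Tuple[int, int]:
    """Return (version, declared_length) for a GLB byte string, or raise ValueError."""
    if len(data) < GLB_HEADER_SIZE:
        raise ValueError("too short to be a GLB file")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic bytes")
    if length != len(data):
        raise ValueError(f"declared length {length} does not match actual {len(data)}")
    return version, length
```

For a file on disk, passing the result of `open(path, "rb").read()` to `read_glb_header` is enough to catch truncated downloads or responses that are not GLB at all.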

Software Integration:

Runtime Engines:

PyTorch

Supported Hardware Platforms:

NVIDIA Blackwell
NVIDIA Hopper
NVIDIA Lovelace

Supported Operating Systems:

Linux
Windows Subsystem for Linux

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.

Model Version(s):

TRELLIS-image-large
TRELLIS-text-large

Training Dataset:

Link: https://huggingface.co/datasets/JeffreyXiang/TRELLIS-500K
Data Modality: Image, Text, 3D Objects
Data Collection Method by dataset: Automated
Labeling Method by dataset: Automated
Properties (Quantity, Dataset Descriptions, Sensor(s)): TRELLIS-500K is a dataset of 500K 3D assets curated from Objaverse(XL), ABO, 3D-FUTURE, HSSD, and Toys4k, filtered based on aesthetic scores.

Testing Dataset:

Link: https://huggingface.co/datasets/JeffreyXiang/TRELLIS-500K
Data Modality: Image, Text, 3D Objects
Data Collection Method by dataset: Automated
Labeling Method by dataset: Automated
Properties (Quantity, Dataset Descriptions, Sensor(s)): TRELLIS-500K is a dataset of 500K 3D assets curated from Objaverse(XL), ABO, 3D-FUTURE, HSSD, and Toys4k, filtered based on aesthetic scores.

Evaluation Dataset:

Link: https://huggingface.co/datasets/JeffreyXiang/TRELLIS-500K
Data Modality: Image, Text, 3D Objects
Data Collection Method by dataset: Automated
Labeling Method by dataset: Automated
Properties (Quantity, Dataset Descriptions, Sensor(s)): TRELLIS-500K is a dataset of 500K 3D assets curated from Objaverse(XL), ABO, 3D-FUTURE, HSSD, and Toys4k, filtered based on aesthetic scores.

Inference:

Engine: PyTorch
Test Hardware: L40S

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.

Microsoft TRELLIS 3D NIM is optimized to run best on the following compute:

GPU	Total GPU memory (GB)	Azure VM compute	#GPUs on VM	Link
A100	80	Standard_NC24ads_A100_v4	1	link
H100	94	Standard_NC40ads_H100_v5	1	link
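
The compute table above can be encoded as data so that deployment scripts pick a VM size programmatically. This is a small sketch under stated assumptions: the memory figures are taken to be in GB, the VM size names use normalized casing, and `VmOption` and `smallest_fit` are hypothetical helper names rather than part of any Azure or NVIDIA tooling.

```python
from typing import NamedTuple, Optional


class VmOption(NamedTuple):
    gpu: str
    gpu_memory_gb: int   # assumed unit: GB per GPU
    vm_size: str         # Azure VM size name, casing normalized
    gpu_count: int


# Rows copied from the compute table.
OPTIONS = [
    VmOption("A100", 80, "Standard_NC24ads_A100_v4", 1),
    VmOption("H100", 94, "Standard_NC40ads_H100_v5", 1),
]


def smallest_fit(min_memory_gb: int) -> Optional[VmOption]:
    """Return the listed VM with the least total GPU memory that still meets the minimum."""
    candidates = [o for o in OPTIONS if o.gpu_memory_gb * o.gpu_count >= min_memory_gb]
    return min(candidates, key=lambda o: o.gpu_memory_gb * o.gpu_count, default=None)
```

Calling `smallest_fit(90)` selects the H100 row, while a requirement above 94 GB returns `None` because no listed VM satisfies it.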