RosettaFold 3
Version: 1
UW Institute for Protein Design
Last updated November 2025
RosettaFold3 (RF3) is a biomolecular structure prediction model, like AlphaFold3, that can fold proteins and other complex molecules, going from 1-dimensional sequence information to 3D shape. RF3 is built around a diffusion-based neural network.
RosettaFold3 (RF3) is a deep learning model for biomolecular structure prediction and modeling, designed to handle proteins, nucleic acids, and small-molecule ligands within the same framework. It is particularly well-suited for tasks that involve multi-chain protein complexes, protein–RNA and protein–DNA assemblies, and protein–ligand docking, making it valuable for drug discovery and structural biology applications, with benchmarking performance close to AlphaFold3.

Inference with RosettaFold3 (RF3)

RF3 is an all-atom biomolecular structure prediction network competitive with leading open-source models. By including additional features at train time (implicit chirality representations and atom-level geometric conditioning), we improve performance on tasks such as prediction of chiral ligands and fixed-backbone or fixed-conformer docking. For more information, please see the preprint, "Accelerating Biomolecular Modeling with AtomWorks and RF3". This guide provides instructions on preparing inputs and processing outputs.

How to run on Azure Foundry

Inference with RF3 on Azure Foundry uses an HTTP API with POST and GET requests. Results remain available for 1 hour after the job completes. Obtain the endpoint from your deployment. Assuming <ENDPOINT> is the endpoint of your deployment, submit a job with a POST request, for example:
POST <ENDPOINT>/async/score
This initial POST request assigns a unique identifier to your query (say "request_id"), which can then be used to retrieve the results of the structure prediction with a GET request:
GET <ENDPOINT>/async/score/{request_id}
Depending on the state of the job, you will see one of the following status messages: processing, success, or failed. Once the job is complete, a curl request will fetch the results as JSON. The JSON contains the predicted confidence scores and metrics (pLDDT, PAE, iPTM, etc.) and the predicted structures (.cif.gz format). The examples below show how to submit a query with curl:
curl -X POST  <ENDPOINT>/async/score  -H "Authorization: Bearer <KEY>"  -H "Content-Type: application/json"  -d '<QUERY_STRING_JSON_FORMAT>'
We show several examples of the query JSON in sections below.
curl -X POST  <ENDPOINT>/async/score  -H "Authorization: Bearer <KEY>"  -H "Content-Type: application/json"  -d '[ 
    {
        "name": "example-without-msa",
        "components": [
            {
                "seq": "MTSENPLLALREKISALDEKLLALFAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLITLGKAHHLDAH(PBF)ITRTFQLGIEYSVLTQQALLEHHHHHH",
                "chain_id": "A"
            },
            {
                "seq": "MTSENPLLALREKISALDEKLLALFAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLITLGKAHHLDAH(PBF)ITRTFQLGIEYSVLTQQALLEHHHHHH",
                "chain_id": "B"
            }
        ]
    }
]'
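
The same submission can be scripted. The following is a minimal Python sketch using only the standard library, not an official client: it mirrors the URL, headers, and list-of-examples body from the curl command above. The helper names `build_submit_request` and `submit`, the placeholder host, and the shortened sequence are illustrative assumptions. The layout of the POST response (including which field carries the request ID) is not specified in this guide, so `submit` simply returns the parsed JSON.

```python
import json
import urllib.request


def build_submit_request(endpoint, api_key, examples):
    """Build (but do not send) the POST request for <ENDPOINT>/async/score."""
    url = endpoint.rstrip("/") + "/async/score"
    body = json.dumps(examples).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def submit(endpoint, api_key, examples, timeout=60):
    """Send the request and return the parsed JSON response as-is.

    The response schema is not documented here, so no field is extracted.
    """
    request = build_submit_request(endpoint, api_key, examples)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder endpoint, key, and shortened sequence (illustrative only).
examples = [
    {
        "name": "example_protein",
        "components": [{"seq": "AINRLQLVATLVEREVRYTPAGV", "chain_id": "A"}],
    }
]
request = build_submit_request("https://example.invalid", "<KEY>", examples)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL and body before any job is submitted.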

Retrieving results of a completed job

In order to fetch the JSON with the predicted structure, use the "request_id" assigned to your query as follows:
curl <ENDPOINT>/async/score/<request_id> -H "Authorization: Bearer <KEY>"
The returned JSON can be redirected to a local file, for instance:
curl <ENDPOINT>/async/score/<request_id> -H "Authorization: Bearer <KEY>"  >  result.json
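
Because the job is asynchronous, the GET request usually has to be repeated until the status changes. The sketch below shows one way to poll in Python. It assumes that the response is a JSON object with a top-level "status" field holding one of the three status messages listed earlier (processing, success, failed); that field name, the helper names, and the polling interval and time limit are assumptions chosen for illustration, so adjust them to the response your deployment actually returns.

```python
import json
import time
import urllib.request


def fetch_status(endpoint, api_key, request_id, timeout=60):
    """GET <ENDPOINT>/async/score/<request_id> and return the parsed JSON."""
    url = f"{endpoint.rstrip('/')}/async/score/{request_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_result(fetch, interval_s=30.0, max_wait_s=3600.0, sleep=time.sleep):
    """Call fetch() repeatedly until the job is no longer processing.

    ASSUMPTION: fetch() returns a dict with a "status" key whose value is
    "processing", "success", or "failed". The defaults for interval_s and
    max_wait_s are arbitrary illustrative values.
    """
    waited = 0.0
    while True:
        result = fetch()
        status = result.get("status")
        if status == "success":
            return result
        if status == "failed":
            raise RuntimeError(f"RF3 job failed: {result}")
        if waited >= max_wait_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s
```

A typical call would be `result = wait_for_result(lambda: fetch_status(ENDPOINT, KEY, request_id))`, followed by `json.dump(result, open("result.json", "w"))` to keep a local copy before the 1-hour retention window ends.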

Processing the output JSON file

After running inference on Azure Foundry, the output JSON contains several keys describing the results. These include the names of the 5 structure files (.cif.gz), one for each of the 5 diffusion seeds. The file contents are Base64-encoded and stored as the corresponding values in the JSON output. Please use the following Jupyter notebook to process this JSON output and obtain the result files.
https://github.com/pabhatia-ms/rf3-object/blob/main/aml-rf3-modelforge.ipynb
After running the above Jupyter notebook to post-process the JSON, you should see several files:
  • 5vht_from_json_metrics.csv - overall confidence metrics for this example
  • 5vht_from_json.score - more granular confidence metrics for this example
  • 5vht_from_json_model_0.cif.gz - zipped model prediction for the first diffusion seed (PyMol can open .gz files directly)
  • 5vht_from_json_model_1.cif.gz - zipped model prediction for the second diffusion seed
  • ...
For this example, the pTM in the metrics.csv should be >0.8 (even without an MSA); if not, there may be something wrong with your setup.
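
For readers who want to see what the decoding step involves, here is a rough Python sketch. It is not a replacement for the notebook, which remains the reference. It assumes the structure files sit at the top level of the result JSON as `file name -> Base64 string` pairs; if your output nests them differently, the loop needs adapting. The function name `extract_files` and the `suffixes` parameter are hypothetical.

```python
import base64
import binascii
from pathlib import Path


def extract_files(result, out_dir, suffixes=(".cif.gz",)):
    """Decode Base64 values whose keys look like result files and write them.

    ASSUMPTION: `result` is a flat dict mapping file names to Base64 strings.
    Keys that do not end in one of `suffixes` are ignored.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, value in result.items():
        if not isinstance(value, str) or not key.endswith(tuple(suffixes)):
            continue
        try:
            data = base64.b64decode(value, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid Base64, leave it alone
        # Use only the final path component so keys cannot escape out_dir.
        path = out_dir / Path(key).name
        path.write_bytes(data)
        written.append(path)
    return written
```

With a saved response, usage would look like `extract_files(json.load(open("result.json")), "rf3_output")`. The written .cif.gz files can then be opened directly in PyMol or parsed with AtomWorks.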

Viewing the Predicted Structure(s)

Use the following code to view the predicted structures with the AtomWorks framework:
from atomworks.io.utils.visualize import view
from atomworks.io import parse

# View in atomworks (or PyMol, etc.)
out = parse("path/to/prediction.cif.gz")
atom_array = out["assemblies"]["1"][0]
# (If in a notebook)
view(atom_array)
Alternative viewing options:
  • View in PyMol like normal, or using pymol_remote
  • Use the view_pymol() function for direct PyMol integration

More examples for other use cases

Folding with only the protein sequence

{
    "name": "example_protein",
    "components": [
        {
            "seq": "AINRLQLVATLVEREVRYTPAGVPIVNCLLSYSGQAEAQAARQVEFSIEALGAGKASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
            "chain_id": "A"
        }
    ]
}
📝 Example JSON configuration for two sequences
{
    "name": "3en2_from_json",
    "components": [
        {
            "seq": "AINRLQLVATLVEREV(MSE)RYTPAGVPIVNCLLSYSGQA(MSE)EAQAARQVEFSIEALGAGK(MSE)ASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
            "chain_id": "A"
        },
        {
            "seq": "AINRLQLVATLVEREV(MSE)RYTPAGVPIVNCLLSYSGQA(MSE)EAQAARQVEFSIEALGAGK(MSE)ASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
            "chain_id": "B"
        }
    ]
}
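
When the same layout is needed for many targets, the example JSON can be generated rather than written by hand. The sketch below builds a multi-chain example in the format shown above, assigning chain IDs A, B, C, and so on in order. The helper `make_example` and the shortened sequence are illustrative assumptions; only the "name", "components", "seq", and "chain_id" keys come from the examples in this guide.

```python
import json
import string


def make_example(name, sequences):
    """Build one example dict with one protein component per sequence."""
    if len(sequences) > len(string.ascii_uppercase):
        raise ValueError("this sketch only assigns single-letter chain IDs A-Z")
    components = [
        {"seq": seq, "chain_id": chain_id}
        for seq, chain_id in zip(sequences, string.ascii_uppercase)
    ]
    return {"name": name, "components": components}


# Shortened placeholder sequence; non-standard residues such as
# selenomethionine are written in parentheses, e.g. (MSE), as above.
monomer = "AINRLQLVATLVEREV(MSE)RYTPAGV"
homodimer = make_example("homodimer_sketch", [monomer, monomer])
print(json.dumps(homodimer, indent=4))
```

The printed JSON has the same shape as the two-sequence configuration above and can be wrapped in a list to form the body of the POST request.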

For stand-alone installation, setup, and a basic prediction, please refer to our GitHub repository: https://github.com/RosettaCommons/modelforge

Folding Many Inputs

When running multiple predictions, you'll notice that the startup cost (importing libraries, loading models, initializing CUDA) often takes significantly longer than the actual prediction itself. Instead of running separate commands, you can process multiple inputs in a single command to amortize the startup cost. Multiple inputs can be provided in three ways. The one shown here is within a single JSON file, where multiple examples are defined in one configuration.

1️⃣ Single JSON with Multiple Examples

📝 Example JSON configuration (full example found at https://github.com/RosettaCommons/modelforge/tree/production/docs/rf3/examples/multiple_example_from_json.json)
[
    {
        "name": "multiple_examples_from_json(1)",
        "components": [
            {
                "seq": "MNAKEIVVHALRLLENGDARGWCDLFHPEGVLEYPYPPPGYKTRFEGRETIWAHMRLFPEYMTIRFTDVQFYETADPDLAIGEFHGDGVHTVSGGKLAADYISVLRTRDGQILLYRLFFNPLRVLEPLGLEHHHHHH",
                "chain_id": "A"
            },
            {
                "smiles": "O=C1OCC(=C1)C5C4(C(O)CC3C(CCC2CC(O)CCC23C)C4(O)CC5)C"
            }
        ]
    },
    {
        "name": "multiple_examples_from_json(2)",
        "components": [
            {
                "seq": "GSGVSLGQALLILSVAALLGTTVEEAVKRALWLKTKLGVSLEQAARTLSVAAYLGTTVEEAVKRALKLKTKLGVSLEQALLILFAAAALGTTVEEAVKRALKLKTKLGVSLEQALLILWTAVELGTTVEEAVKRALKLKTKLGVSLGQAQAILVVAAELGTTVEEAVYRALKLKTKLGVSLGQALLILEVAAKLGTTVEEAVKRALKLTTKLG",
                "chain_id": "A"
            },
            {
                "ccd_code": "MG"
            }
        ]
    }
]
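
Before submitting a large batch, it can help to sanity-check the list locally. The following sketch applies a few checks inferred from the examples in this guide; they are assumptions, not documented server rules: every example has a non-empty "name", names are unique within the batch, and each component specifies exactly one of "seq", "smiles", or "ccd_code". The function name `check_examples` is hypothetical.

```python
COMPONENT_KEYS = ("seq", "smiles", "ccd_code")


def check_examples(examples):
    """Return a list of human-readable problems found in a batch (empty if none)."""
    problems = []
    seen_names = set()
    for i, example in enumerate(examples):
        name = example.get("name")
        if not name:
            problems.append(f"example {i}: missing name")
        elif name in seen_names:
            problems.append(f"example {i}: duplicate name {name!r}")
        else:
            seen_names.add(name)
        components = example.get("components") or []
        if not components:
            problems.append(f"example {i}: no components")
        for j, component in enumerate(components):
            present = [key for key in COMPONENT_KEYS if key in component]
            if len(present) != 1:
                problems.append(
                    f"example {i}, component {j}: expected exactly one of "
                    f"{COMPONENT_KEYS}, found {present}"
                )
    return problems


batch = [
    {"name": "protein_with_ligand",
     "components": [{"seq": "MNAKEIVVHALRLLENGD", "chain_id": "A"},
                    {"smiles": "CCO"}]},
    {"name": "protein_with_ion",
     "components": [{"seq": "GSGVSLGQALLILSVAALLG", "chain_id": "A"},
                    {"ccd_code": "MG"}]},
]
print(check_examples(batch) or "batch looks consistent")
```

The sequences and the ethanol SMILES string in `batch` are short placeholders. An empty problem list only means the batch matches the shape of the examples above, not that the server will accept it.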

Responsible AI Considerations

All datasets used in this model's development are publicly available and do not contain any PHI or PII data or any other sensitive personal data.
RosettaFold3 (RF3) was evaluated against recent state-of-the-art models, AlphaFold3 and Boltz-1/2, on protein-protein and protein-ligand interaction prediction using held-out test sets from the PDB. The results on these two benchmarks, together with chiral center accuracy, are reported below. Further benchmark performance and details of the evaluation can be found in the paper at: https://www.biorxiv.org/content/10.1101/2025.08.14.670328 The source code for this model is available in the GitHub repository at:
https://github.com/RosettaCommons/modelforge

| Category | Benchmark | RosettaFold3 (RF3) | Boltz-1x | Boltz-2 | AlphaFold3 |
|---|---|---|---|---|---|
| Interface lDDT | Protein-Protein interaction | 0.61 | 0.53 | 0.55 | 0.66 |
| Interface lDDT | Protein-Ligand interaction | 0.80 | 0.67 | 0.75 | 0.86 |
| Chiral center accuracy | Protein Data Bank (PDB) test set | 83% | n/a | 78% | 80% |

Model Specifications

License: Custom
Last Updated: November 2025
Input Type: Text
Output Type: Text
Provider: UW Institute for Protein Design
Languages: 1 Language