RosettaFold 3
Version: 1
RosettaFold3 (RF3) is a deep learning model for biomolecular structure prediction and modeling, designed to handle proteins, nucleic acids, and small-molecule ligands within the same framework. It is particularly well-suited for tasks that involve multi-chain protein complexes, protein–RNA and protein–DNA assemblies, and protein–ligand docking, making it valuable for drug discovery and structural biology applications, with benchmarking performance close to AlphaFold3.
Inference with RosettaFold3 (RF3)
RF3 is an all-atom biomolecular structure prediction network competitive with leading open-source models. By including additional features at train time (implicit chirality representations and atom-level geometric conditioning), we improve performance on tasks such as prediction of chiral ligands and fixed-backbone or fixed-conformer docking. For more information, please see the preprint, Accelerating Biomolecular Modeling with AtomWorks and RF3. This guide provides instructions on preparing inputs and processing outputs.
How to run on Azure Foundry
Inference with RF3 on Azure Foundry uses an HTTP API: a job is submitted with a POST request and its result is fetched with a GET request. Results remain available for 1 hour after the job completes. Please obtain the endpoint from your deployment. Assuming that <ENDPOINT> is the endpoint of your deployment, and <KEY> is the credential sent as a Bearer token, the API exposes two routes:
POST <ENDPOINT>/async/score
GET <ENDPOINT>/async/score/{request_id}
A job is submitted with a POST request of the following form:
curl -X POST <ENDPOINT>/async/score -H "Authorization: Bearer <KEY>" -H "Content-Type: application/json" -d '<QUERY_STRING_JSON_FORMAT>'
For example:
curl -X POST <ENDPOINT>/async/score -H "Authorization: Bearer <KEY>" -H "Content-Type: application/json" -d '[
{
"name": "example-without-msa",
"components": [
{
"seq": "MTSENPLLALREKISALDEKLLALFAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLITLGKAHHLDAH(PBF)ITRTFQLGIEYSVLTQQALLEHHHHHH",
"chain_id": "A"
},
{
"seq": "MTSENPLLALREKISALDEKLLALFAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLITLGKAHHLDAH(PBF)ITRTFQLGIEYSVLTQQALLEHHHHHH",
"chain_id": "B"
}
]
}
]'
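The same submission can be sketched in Python using only the standard library. This is a minimal illustration, not part of the official tooling: the helper names `build_submit_request` and `submit`, the placeholder host `https://example.invalid`, and the shortened sequence are hypothetical, and the shape of the server's reply is assumed to be JSON without any assumption about its fields.

```python
import json
import urllib.request


def build_submit_request(endpoint, api_key, examples):
    """Build (but do not send) the POST request for <ENDPOINT>/async/score."""
    body = json.dumps(examples).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/async/score",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(endpoint, api_key, examples, timeout=60):
    """Send the request and return the parsed reply.

    Assumption: the reply body is JSON; its exact fields are not specified here.
    """
    req = build_submit_request(endpoint, api_key, examples)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical, shortened input in the same format as the curl example above.
examples = [
    {
        "name": "example-without-msa",
        "components": [{"seq": "MTSENPLLALREKISALDEK", "chain_id": "A"}],
    }
]
request = build_submit_request("https://example.invalid", "KEY", examples)
```

Calling `submit(<ENDPOINT>, <KEY>, examples)` with real values would perform the network call; building the request separately makes it easy to inspect the URL, headers, and body first.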
Retrieving results of a completed job
To fetch the JSON with the predicted structure, use the "request_id" assigned to your query as follows:
curl <ENDPOINT>/async/score/<request_id> -H "Authorization: Bearer <KEY>"
To save the response to a file instead of printing it:
curl <ENDPOINT>/async/score/<request_id> -H "Authorization: Bearer <KEY>" > result.json
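A Python equivalent of the retrieval step might look like the sketch below. The helper names `build_result_request` and `save_result` are hypothetical, and the sketch assumes the job has already completed; how the service reports a job that is still running is not described here, so no polling logic is attempted.

```python
import urllib.request
from urllib.parse import quote


def build_result_request(endpoint, api_key, request_id):
    """Build (but do not send) the GET request for <ENDPOINT>/async/score/<request_id>."""
    url = f"{endpoint.rstrip('/')}/async/score/{quote(str(request_id), safe='')}"
    return urllib.request.Request(
        url=url,
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def save_result(endpoint, api_key, request_id, out_path="result.json", timeout=60):
    """Download the response body and write it unchanged to out_path."""
    req = build_result_request(endpoint, api_key, request_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
    return out_path
```

Because results are only kept for 1 hour after the job completes, it is worth saving the response to disk as soon as it is available.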
Processing the output JSON file
After running inference using Azure Foundry, the output JSON file will have several keys describing the results. These are the names of the 5 structure files (.cif.gz extension), one for each of the 5 diffusion seeds. The file contents are encoded in Base64 format and stored as the corresponding values in the JSON output. Please use the following Jupyter notebook to process this JSON output and obtain the result files:
https://github.com/pabhatia-ms/rf3-object/blob/main/aml-rf3-modelforge.ipynb
After running the above Jupyter notebook to post-process the JSON, you should see several files:
- 5vht_from_json_metrics.csv - overall confidence metrics for this example
- 5vht_from_json.score - more granular confidence metrics for this example
- 5vht_from_json_model_0.cif.gz - zipped model prediction for the first diffusion seed (PyMol can open .gz files directly)
- 5vht_from_json_model_1.cif.gz - zipped model prediction for the second diffusion seed
- ...
metrics.csv should be >0.8 (even without an MSA); if not, there may be something wrong with your setup.
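For readers who want to see what the decoding step amounts to, here is a rough sketch. It assumes, based only on the description above, that the structure files appear as top-level keys ending in `.cif.gz` whose values are Base64 strings; the real layout may differ, and the linked notebook remains the reference. The function name `extract_structures` is hypothetical.

```python
import base64
from pathlib import Path


def extract_structures(result, out_dir):
    """Write every Base64-encoded '.cif.gz' entry of `result` to out_dir.

    Assumption: `result` is the parsed output JSON (a dict) whose keys are
    file names and whose values are Base64 strings. Other keys are skipped.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, value in result.items():
        if key.endswith(".cif.gz") and isinstance(value, str):
            # Keep only the file name so a key cannot point outside out_dir.
            path = out_dir / Path(key).name
            path.write_bytes(base64.b64decode(value))
            written.append(path)
    return written
```

A typical call would be `extract_structures(json.load(open("result.json")), "rf3_out")`, after which the written files can be opened like any other gzipped CIF file.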
Viewing the Predicted Structure(s)
Use the following code to view the predicted structures with the AtomWorks framework:
from atomworks.io.utils.visualize import view
from atomworks.io import parse
# View in atomworks (or PyMol, etc.)
out = parse("path/to/prediction.cif.gz")
atom_array = out["assemblies"]["1"][0]
# (If in a notebook)
view(atom_array)
- View in PyMol like normal, or using pymol_remote
- Use the view_pymol() function for direct PyMol integration
More examples for other use cases
Folding with only the protein sequence
{
"name": "example_protein",
"components": [
{
"seq": "AINRLQLVATLVEREVRYTPAGVPIVNCLLSYSGQAEAQAARQVEFSIEALGAGKASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
"chain_id": "A"
}
]
}
Folding a two-chain complex with non-canonical residues (MSE)
{
"name": "3en2_from_json",
"components": [
{
"seq": "AINRLQLVATLVEREV(MSE)RYTPAGVPIVNCLLSYSGQA(MSE)EAQAARQVEFSIEALGAGK(MSE)ASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
"chain_id": "A"
},
{
"seq": "AINRLQLVATLVEREV(MSE)RYTPAGVPIVNCLLSYSGQA(MSE)EAQAARQVEFSIEALGAGK(MSE)ASVLDRIAPGTVLECVGFLARKHRSSKALVFHISGLEHHHHHH",
"chain_id": "B"
}
]
}
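The sequences above mix one-letter residue codes with parenthesized codes such as (MSE). A small pre-flight check can catch unbalanced parentheses before a job is submitted. The sketch below assumes that each parenthesized code stands for a single residue; this reading is inferred from the examples, not confirmed by the source, and the function name `tokenize` is hypothetical.

```python
import re

# One token is either a parenthesized code like "(MSE)" or a single uppercase letter.
_TOKEN = re.compile(r"\(([A-Za-z0-9]+)\)|([A-Z])")


def tokenize(seq):
    """Split a sequence string into residue tokens.

    Raises ValueError on anything that is neither a one-letter code nor a
    complete parenthesized code (for example an unclosed parenthesis).
    """
    tokens = []
    pos = 0
    while pos < len(seq):
        match = _TOKEN.match(seq, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {seq[pos]!r}")
        tokens.append(match.group(1) or match.group(2))
        pos = match.end()
    return tokens


residues = tokenize("AINRLQLVATLVEREV(MSE)RYTPAG")
modified = [r for r in residues if len(r) > 1]
```

Here `residues` holds one entry per position under the stated assumption, and `modified` lists the parenthesized codes that were found.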
For stand-alone installation, setup, and a basic prediction, please refer to our GitHub repository: https://github.com/RosettaCommons/modelforge
Folding Many Inputs
When running multiple predictions, you'll notice that the startup cost (importing libraries, loading models, initializing CUDA) often takes significantly longer than the actual prediction itself. Instead of running separate commands, you can process multiple inputs in a single command to amortize the startup cost. Multiple inputs can be provided in three ways; the one shown here is to define multiple examples within a single JSON file.
Single JSON with Multiple Examples
Example JSON configuration (full example found at https://github.com/RosettaCommons/modelforge/tree/production/docs/rf3/examples/multiple_example_from_json.json):
[
{
"name": "multiple_examples_from_json(1)",
"components": [
{
"seq": "MNAKEIVVHALRLLENGDARGWCDLFHPEGVLEYPYPPPGYKTRFEGRETIWAHMRLFPEYMTIRFTDVQFYETADPDLAIGEFHGDGVHTVSGGKLAADYISVLRTRDGQILLYRLFFNPLRVLEPLGLEHHHHHH",
"chain_id": "A"
},
{
"smiles": "O=C1OCC(=C1)C5C4(C(O)CC3C(CCC2CC(O)CCC23C)C4(O)CC5)C"
}
]
},
{
"name": "multiple_examples_from_json(2)",
"components": [
{
"seq": "GSGVSLGQALLILSVAALLGTTVEEAVKRALWLKTKLGVSLEQAARTLSVAAYLGTTVEEAVKRALKLKTKLGVSLEQALLILFAAAALGTTVEEAVKRALKLKTKLGVSLEQALLILWTAVELGTTVEEAVKRALKLKTKLGVSLGQAQAILVVAAELGTTVEEAVYRALKLKTKLGVSLGQALLILEVAAKLGTTVEEAVKRALKLTTKLG",
"chain_id": "A"
},
{
"ccd_code": "MG"
}
]
}
]
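A multi-example configuration like the one above can also be assembled programmatically. The sketch below only uses the fields that appear in this document (`name`, `components`, `seq`, `chain_id`, `smiles`, `ccd_code`); the helper names are hypothetical, and the uniqueness check on `name` is an assumption motivated by the output files being named after the example.

```python
import json


def protein(seq, chain_id):
    """A polymer component given by sequence and chain ID."""
    return {"seq": seq, "chain_id": chain_id}


def ligand_smiles(smiles):
    """A ligand component given as a SMILES string."""
    return {"smiles": smiles}


def ligand_ccd(ccd_code):
    """A ligand or ion component given by its CCD code, e.g. "MG"."""
    return {"ccd_code": ccd_code}


def example(name, components):
    """One prediction job: a name plus a non-empty list of components."""
    if not components:
        raise ValueError("an example needs at least one component")
    return {"name": name, "components": list(components)}


def batch(examples):
    """Serialize several examples as one JSON list.

    Assumption: names should be unique, since outputs are named after them.
    """
    names = [e["name"] for e in examples]
    if len(set(names)) != len(names):
        raise ValueError("example names must be unique")
    return json.dumps(examples, indent=2)


# Shortened, hypothetical sequences for illustration only.
payload = batch([
    example("job_1", [protein("MNAKEIVVHALRLLENG", "A"), ligand_smiles("CCO")]),
    example("job_2", [protein("GSGVSLGQALLILSVAA", "A"), ligand_ccd("MG")]),
])
```

The resulting `payload` string has the same list-of-examples shape as the JSON shown above and can be written to a file or sent as a request body.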
Responsible AI Considerations
All datasets used in this model's development are publicly available and do not contain any PHI or PII data or any other sensitive personal data.
RosettaFold3 (RF3) was evaluated against recent state-of-the-art models, AlphaFold3 and Boltz-1/2, on the tasks of protein-protein interaction prediction and protein-ligand interaction prediction, using held-out test sets from the PDB. Below are the results on these two benchmarks. Further benchmark performance and details of the evaluation can be found in the paper at: https://www.biorxiv.org/content/10.1101/2025.08.14.670328
The source code for this model is available on the github repository at:
https://github.com/RosettaCommons/modelforge
| Category | Benchmark | RosettaFold3 (RF3) | Boltz-1x | Boltz-2 | AlphaFold3 |
|---|---|---|---|---|---|
| Interface lDDT | Protein-Protein interaction | 0.61 | 0.53 | 0.55 | 0.66 |
| Interface lDDT | Protein-Ligand interaction | 0.80 | 0.67 | 0.75 | 0.86 |
| Chiral center accuracy | Protein Data Bank (PDB) test set | 83 % | -na- | 78 % | 80 % |
Model Specifications
License: Custom
Last Updated: November 2025
Input Type: Text
Output Type: Text
Provider: UW Institute for Protein Design
Languages: 1 Language