EvoDiff
Version: 1
Key capabilities
About this model
EvoDiff can unconditionally sample diverse, structurally plausible proteins, generate intrinsically disordered regions, and scaffold structural motifs using only sequence information, challenging a paradigm in structure-based protein design.
Key model capabilities
Below are several use cases for EvoDiff. Currently, Azure AI Foundry supports unconditional or conditional design with EvoDiff-Seq. To use EvoDiff-MSA, we point you to our GitHub repository for more information. (A minimal usage sketch follows this list.)
- Unconditional generation with EvoDiff-Seq or EvoDiff-MSA (https://github.com/microsoft/evodiff/blob/main/README.md#unconditional-generation-with-evodiff-msa)
- Conditional sequence generation
- Evolution-guided protein generation with EvoDiff-MSA
- Generating intrinsically disordered regions with EvoDiff-Seq and EvoDiff-MSA
- Scaffolding functional motifs with EvoDiff-Seq and EvoDiff-MSA
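As a rough illustration of the unconditional-generation workflow, the sketch below loads the 38M-parameter OADM checkpoint with the `evodiff` Python package. The loader name `OA_DM_38M` and the sampler `generate_oaardm` follow the public GitHub README, but their exact signatures and return values are assumptions here; check the repository linked above before relying on them.

```python
# Minimal sketch of unconditional generation with EvoDiff-Seq (OADM, 38M).
# Names follow the EvoDiff GitHub README; the sampler's signature and return
# value are assumptions and may differ between releases.
import torch
from evodiff.pretrained import OA_DM_38M       # pretrained checkpoint loader
from evodiff.generate import generate_oaardm   # OADM sampling routine (assumed signature)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Returns the model together with the collater/tokenizer/scheme used in training.
model, collater, tokenizer, scheme = OA_DM_38M()
model = model.to(device).eval()

# Sample one sequence of length 100: start from a fully masked sequence and
# unmask one position per step via the learned reverse process.
with torch.no_grad():
    samples = generate_oaardm(model, tokenizer, seq_len=100, batch_size=1, device=device)
print(samples)
```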
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
Below are several use cases for EvoDiff. Currently, Azure AI Foundry supports unconditional or conditional design with EvoDiff-Seq. To use EvoDiff-MSA, we point you to our GitHub repository for more information.
Out of scope use cases
This model is intended for use on protein sequences. It is not meant for natural language or other biological sequences, such as DNA sequences. The model will not generate sequences that are not proteins, including other biological sequences such as DNA or natural language. In other words, the model performs best on data within its training distribution, which includes protein sequences and multiple sequence alignments (MSAs).
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Technical specs
We trained all EvoDiff sequence models on 42M sequences from UniRef50 using a dilated convolutional neural network architecture introduced in the CARP protein masked language model. We trained 38M-parameter and 640M-parameter versions for each forward corruption scheme and for left-to-right autoregressive (LRAR) decoding.
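The dilated convolutional backbone is described only at a high level here. As a point of reference, the block below is a minimal, generic dilated 1-D convolution residual block in PyTorch; it is not the CARP implementation, and the layer width, kernel size, and dilation schedule are placeholders chosen for illustration.

```python
# Illustrative only: a minimal dilated 1-D convolutional residual block in the
# spirit of the CARP-style architecture referenced above. Not the actual
# EvoDiff/CARP code; sizes and dilation schedule are placeholders.
import torch
import torch.nn as nn

class DilatedConvBlock(nn.Module):
    def __init__(self, d_model: int, dilation: int, kernel_size: int = 5):
        super().__init__()
        padding = dilation * (kernel_size - 1) // 2   # keep sequence length fixed
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              dilation=dilation, padding=padding)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, d_model) -> residual update with a dilated convolution
        h = self.norm(x).transpose(1, 2)              # Conv1d expects (B, C, L)
        h = self.act(self.conv(h)).transpose(1, 2)
        return x + h

# Stack blocks with geometrically increasing dilation to cover long sequences.
stack = nn.Sequential(*[DilatedConvBlock(128, dilation=2 ** i) for i in range(5)])
x = torch.randn(2, 300, 128)                          # batch of embedded sequences
print(stack(x).shape)                                 # torch.Size([2, 300, 128])
```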
Training cut-off date
The provider has not supplied this information.
Training time
- Hardware Type: 32GB NVIDIA V100 GPUs
- Hours used: 4,128 (14 days per sequence model, 10 days per MSA model)
- Cloud Provider: Azure
- Compute Region: East US
- Carbon Emitted: 485.21 kg
Input formats
The provider has not supplied this information.
Output formats
The provider has not supplied this information.
Supported languages
The provider has not supplied this information.
Sample JSON response
The provider has not supplied this information.
Model architecture
We investigated two types of forward processes for diffusion over discrete data modalities to determine which would be most effective. In order-agnostic autoregressive diffusion (OADM), one amino acid is converted to a special mask token at each step in the forward process. After $T = L$ steps, where $L$ is the length of the sequence, the entire sequence is masked. We additionally designed discrete denoising diffusion probabilistic models (D3PM) for protein sequences. In EvoDiff-D3PM, the forward process corrupts sequences by sampling mutations according to a transition matrix, such that after $T$ steps the sequence is indistinguishable from a uniform sample over the amino acids. In the reverse process for both, a neural network model is trained to undo the previous corruption. The trained model can then generate new sequences starting from sequences of masked tokens or of uniformly sampled amino acids for EvoDiff-OADM or EvoDiff-D3PM, respectively.
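To make the OADM forward process concrete, here is a toy, self-contained sketch (not the library's code) that masks one position per step until the whole sequence of length $L$ is masked after $T = L$ steps.

```python
# Toy sketch of the OADM forward (corruption) process described above: at each
# of T = L steps, one not-yet-masked position is replaced by a mask token, so
# the sequence is fully masked after L steps. Illustrative only.
import random

MASK = "#"  # stand-in for the special mask token

def oadm_forward(sequence: str, seed: int = 0):
    rng = random.Random(seed)
    order = list(range(len(sequence)))
    rng.shuffle(order)                      # random masking order
    seq = list(sequence)
    trajectory = ["".join(seq)]
    for pos in order:                       # T = L corruption steps
        seq[pos] = MASK
        trajectory.append("".join(seq))
    return trajectory

for step, s in enumerate(oadm_forward("MKTAYIAK")):
    print(step, s)
# The reverse (generation) process starts from the fully masked sequence and
# trains a network to predict the amino acid at one masked position per step.
```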
Long context
The provider has not supplied this information.
Optimizing model performance
The provider has not supplied this information.
Additional assets
For all other models in the EvoDiff suite, please see our GitHub repository. We provide all generated sequences on the EvoDiff Zenodo.
Training disclosure
Training, testing and validation
We obtain sequences from the UniRef50 dataset, which contains approximately 42 million protein sequences. The multiple sequence alignments (MSAs) are from the OpenFold dataset, which contains 401,381 MSAs for 140,000 unique Protein Data Bank (PDB) chains and 16,000,000 UniClust30 clusters. The intrinsically disordered region (IDR) data was obtained from the Reverse Homology GitHub. For the scaffolding structural motif benchmark, we provide pdb and fasta files used for conditionally generating sequences in the examples/scaffolding-pdbs folder. We also provide pdb files used for conditionally generating MSAs in the examples/scaffolding-msas folder.
Distribution
Distribution channels
The provider has not supplied this information.
More information
Start using EvoDiff on Azure AI Foundry with this Jupyter Notebook. For full details, please refer to our preprint.
Responsible AI considerations
Safety techniques
The provider has not supplied this information.
Safety evaluations
The provider has not supplied this information.
Known limitations
This model will not generate sequences that are not proteins, including other biological sequences such as DNA or natural language. In other words, the model performs best on data within its training distribution, which includes protein sequences and multiple sequence alignments (MSAs). Based on review of currently available information, EvoDiff is not expected to provide any notable uplift in expertise to users. It is also very unlikely to create any new or add to any known CBRN or advanced-autonomy risks.
Acceptable use
Acceptable use policy
This model is intended for use on protein sequences. It is not meant for natural language or other biological sequences, such as DNA sequences.
Primary Use Cases
Below are several use cases for EvoDiff. Currently, Azure AI Foundry supports unconditional or conditional design with EvoDiff-Seq. To use EvoDiff-MSA, we point you to our GitHub repository for more information.
- Unconditional generation with EvoDiff-Seq or EvoDiff-MSA (https://github.com/microsoft/evodiff/blob/main/README.md#unconditional-generation-with-evodiff-msa)
- Conditional sequence generation
- Evolution-guided protein generation with EvoDiff-MSA
- Generating intrinsically disordered regions with EvoDiff-Seq and EvoDiff-MSA
- Scaffolding functional motifs with EvoDiff-Seq and EvoDiff-MSA
Out-of-Scope Use Cases
This model is intended for use on protein sequences. It is not meant for natural language or other biological sequences, such as DNA sequences.
Quality and performance evaluations
Source: Microsoft
EvoDiff-Seq performance
The reconstruction KL (Recon KL) was calculated between the distribution of amino acids in the test set and in generated samples (n=1000). The perplexity was computed on 25k samples from the test set. The minimum Hamming distance to any train sequence of the same length (Hamming) is reported for each model as the mean ± standard deviation over the generated samples. Fréchet ProtT5 distance (FPD) was calculated between the test set and generated samples. The secondary structure KL (SS KL) was calculated between the means of the predicted secondary structures of the test and generated samples. (An illustrative sketch of two of these metrics follows the table and its footnotes.)

| Model | Params | Recon KL | Perplexity | Hamming | FPD | SS KL |
|---|---|---|---|---|---|---|
| Test | - | 9.92e-4¹ | - | 0.0039² | 0.101 | 1.37e-5¹ |
| EvoDiff-Seq (D3PM BLOSUM) | 38M | 1.77e-2 | 17.16 | 0.83 ± 0.05 | 1.42 | 3.30e-5 |
| EvoDiff-Seq (D3PM Uniform) | 38M | 1.48e-3 | 18.82 | 0.83 ± 0.05 | 1.31 | 3.73e-5 |
| EvoDiff-Seq (OADM) | 38M | 1.11e-3 | 14.61 | 0.83 ± 0.07 | 0.92 | 1.61e-4 |
| EvoDiff-Seq (D3PM BLOSUM) | 640M | 3.73e-2 | 15.74 | 0.83 ± 0.05 | 1.53 | 4.96e-4 |
| EvoDiff-Seq (D3PM Uniform) | 640M | 2.90e-3 | 18.47 | 0.83 ± 0.05 | 1.35 | 2.13e-4 |
| EvoDiff-Seq (OADM) | 640M | 1.26e-3 | 13.05 | 0.83 ± 0.08 | 0.88 | 1.48e-4 |
| LRAR | 38M | 7.90e-4 | 12.38 | 0.82 ± 0.06 | 0.86 | 1.61e-4 |
| CARP | 38M | 5.71e-1 | 25.13 | 0.74 ± 0.07 | 6.30 | 2.72e-3 |
| LRAR | 640M | 7.01e-4 | 10.41 | 0.83 ± 0.06 | 0.63 | 1.76e-5 |
| CARP | 640M | 3.56e-1 | 31.77 | 0.84 ± 0.05 | 1.78 | 5.03e-3 |
| ESM-1b³ | 650M | 4.91e-1 | 53.49 | 0.83 ± 0.06 | 6.67 | 5.48e-4 |
| ESM-2³ | 650M | 5.00e-1 | 68.39 | 0.84 ± 0.06 | 6.79 | 3.05e-3 |
| FoldingDiff⁴ | 14M | 5.49e-2 | - | - | 1.64 | 1.76e-3 |
| RFdiffusion⁵ | 60M | 7.19e-2 | - | - | 1.96 | 5.98e-3 |
| Random | - | 1.65e-1 | 20 | 0.85 ± 0.04 | 3.16 | 1.90e-4 |
1. Calculated between the test set and validation set.
2. Reported value is the minimum Hamming distance between any two natural sequences of the same length in UniRef50.
3. Due to model constraints, the maximum sequence length sampled was 1022.
4. For the FoldingDiff baseline, 1000 structures generated by FoldingDiff were randomly selected, and the corresponding 1000 inferred sequences were inverse-folded using ESM-IF. These sequences are between 50 and 128 residues in length.
5. For the RFdiffusion baseline, 1000 structures were generated matching the UniRef training length distribution, and 1000 corresponding sequences were inverse-folded using ESM-IF.
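For concreteness, the sketch below shows one way two of the tabulated metrics could be computed: the amino-acid distribution KL between a test set and generated samples, and the minimum normalized Hamming distance from a generated sequence to same-length reference sequences. It is an illustration under simplifying assumptions, not the evaluation code used to produce the table.

```python
# Illustrative sketch (not the paper's evaluation code) of two metrics above:
# KL divergence between amino-acid frequency distributions, and the minimum
# normalized Hamming distance to same-length reference sequences.
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_distribution(seqs):
    counts = Counter(aa for s in seqs for aa in s)
    total = sum(counts.values())
    # small pseudocount keeps the KL finite when an amino acid is absent
    return {aa: (counts[aa] + 1e-8) / (total + 20e-8) for aa in AMINO_ACIDS}

def kl_divergence(p, q):
    return sum(p[aa] * math.log(p[aa] / q[aa]) for aa in AMINO_ACIDS)

def min_hamming(generated, reference_seqs):
    same_len = [t for t in reference_seqs if len(t) == len(generated)]
    if not same_len:
        return None
    return min(sum(a != b for a, b in zip(generated, t)) / len(generated)
               for t in same_len)

test = ["MKTAYIAKQR", "GAVLIMCFYW"]          # toy stand-ins for real sequence sets
generated = ["MKSAYIGKQR", "AAVLLMCFYW"]
print("Recon KL:", kl_divergence(aa_distribution(test), aa_distribution(generated)))
print("Hamming:", [min_hamming(g, test) for g in generated])
```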
EvoDiff-MSA performance
The perplexity is calculated based on the ability of each model to reconstruct a subsampled MSA from the validation set. "Max Perplexity" and "Rand. Perplexity" indicate MaxHamming and Random subsampling, respectively, for construction of the validation MSA.

| Model | Subsampling | Params | Max Perplexity | Rand. Perplexity |
|---|---|---|---|---|
| EvoDiff-MSA (D3PM BLOSUM) | Random | 100M | 11.35 | 8.31 |
| EvoDiff-MSA (D3PM BLOSUM) | Max | 100M | 10.98 | 7.61 |
| EvoDiff-MSA (D3PM Uniform) | Random | 100M | 10.14 | 6.77 |
| EvoDiff-MSA (D3PM Uniform) | Max | 100M | 10.06 | 6.66 |
| EvoDiff-MSA (OADM) | Random | 100M | 6.05 | 3.64 |
| EvoDiff-MSA (OADM) | Max | 100M | 6.14 | 3.60 |
| ESM-MSA-1b | Max | 100M | 11.20 | 5.89 |
EvoDiff-Seq structural plausibility metrics
Metrics are reported as the mean ± standard deviation for 1000 generated samples for each model.

| Model | Params | ESM-IF scPerplexity | ProteinMPNN scPerplexity | OmegaFold pLDDT |
|---|---|---|---|---|
| Test | - | 8.04 ± 4.04 | 3.09 ± 0.63 | 68.25 ± 17.85 |
| EvoDiff-Seq (D3PM BLOSUM) | 38M | 12.38 ± 2.06 | 3.80 ± 0.49 | 42.76 ± 14.55 |
| EvoDiff-Seq (D3PM Uniform) | 38M | 12.03 ± 2.04 | 3.77 ± 0.50 | 42.37 ± 14.39 |
| EvoDiff-Seq (OADM) | 38M | 11.61 ± 2.38 | 3.72 ± 0.50 | 43.78 ± 14.18 |
| EvoDiff-Seq (D3PM BLOSUM) | 640M | 11.86 ± 2.21 | 3.73 ± 0.48 | 44.14 ± 13.80 |
| EvoDiff-Seq (D3PM Uniform) | 640M | 12.29 ± 2.05 | 3.78 ± 0.49 | 41.65 ± 14.32 |
| EvoDiff-Seq (OADM) | 640M | 11.53 ± 2.50 | 3.71 ± 0.52 | 44.46 ± 14.62 |
| LRAR | 38M | 11.61 ± 2.38 | 3.64 ± 0.56 | 48.26 ± 14.87 |
| CARP | 38M | 9.68 ± 2.56 | 3.66 ± 0.62 | 50.79 ± 12.06 |
| LRAR | 640M | 10.99 ± 2.63 | 3.59 ± 0.54 | 48.71 ± 15.47 |
| CARP | 640M | 14.13 ± 2.42 | 4.05 ± 0.52 | 41.56 ± 14.35 |
| ESM-1b | 650M | 13.90 ± 2.44 | 3.47 ± 0.68 | 58.07 ± 15.64 |
| ESM-2 | 650M | 14.02 ± 2.87 | 3.58 ± 0.69 | 50.70 ± 15.67 |
| Random | - | 14.68 ± 1.97 | 3.96 ± 0.50 | 39.97 ± 14.05 |
EvoDiff-MSA homolog-conditioned generation
Metrics are reported as the mean ± standard deviation over 250 generated samples for each model. The first subsampling method listed describes the subsampling procedure used to train the model, and the second describes the subsampling procedure used for generation.

| Model | scPerplexity | pLDDT | Seq. similarity | TM score |
|---|---|---|---|---|
| Valid | 5.93 ± 3.19 | 73.99 ± 17.80 | 14.58 ± 21.64¹ | - |
| EvoDiff-MSA (OADM (Rand) - Rand MSA) | 9.41 ± 2.61 | 55.99 ± 14.75 | 6.13 ± 9.88 | 0.49 ± 0.23 |
| EvoDiff-MSA (OADM (Max) - Max MSA) | 9.38 ± 2.57 | 57.08 ± 16.01 | 6.74 ± 11.00 | 0.50 ± 0.23 |
| EvoDiff-MSA (OADM (Max) - Rand MSA) | 9.59 ± 2.69 | 54.95 ± 16.83 | 6.55 ± 10.49 | 0.46 ± 0.23 |
| ESM-MSA-1b | 10.05 ± 2.92 | 51.64 ± 16.54 | 7.13 ± 11.60 | 0.40 ± 0.23 |
| Potts | 10.34 ± 2.26 | 55.46 ± 13.82 | 12.01 ± 17.19 | 0.17 ± 0.10 |
1. Sequence similarity is calculated between the original query sequence and all the sequences in the MSA.
Scaffolding performance of EvoDiff-Seq
Number of scaffolding successes out of 100 generations for RFdiffusion, EvoDiff-Seq, the LRAR baseline, the CARP baseline, and randomly sampled scaffolds (Random), for each of 17 scaffolding problems. The bottom row contains the total number of successful scaffolds generated per model.

| PDB | RFdiffusion | EvoDiff-Seq | LRAR | CARP | Random |
|---|---|---|---|---|---|
| 1BCF | 100 | 24 | 0 | 4 | 0 |
| 6E6R | 71 | 16 | 7 | 3 | 1 |
| 2KL8 | 88 | 0 | 1 | 1 | 0 |
| 6EXZ | 42 | 0 | 0 | 0 | 0 |
| 1YCR | 74 | 13 | 12 | 10 | 7 |
| 6VW1 | 69 | 1 | 0 | 0 | 0 |
| 4JHW | 0 | 0 | 0 | 0 | 0 |
| 5TPN | 61 | 0 | 0 | 0 | 0 |
| 4ZYP | 40 | 0 | 0 | 0 | 0 |
| 3IXT | 25 | 23 | 22 | 13 | 7 |
| 7MRX | 7 | 0 | 0 | 0 | 0 |
| 1PRW | 8 | 68 | 70 | 54 | 5 |
| 5IUS | 2 | 0 | 0 | 0 | 0 |
| 5YUI | 0 | 4 | 0 | 0 | 0 |
| 5WN9 | 0 | 0 | 0 | 0 | 2 |
| 1QJG | 0 | 0 | 0 | 0 | 0 |
| 5TRV | 22 | 0 | 0 | 0 | 0 |
| Total | 610 | 149 | 112 | 85 | 22 |
Scaffolding performance of EvoDiff-MSA
Number of scaffolding successes out of 100 generations for RFdiffusion, EvoDiff-MSA (Max), EvoDiff-MSA (Random), and the ESM-MSA baseline, for each of 17 scaffolding problems. The bottom row contains the total number of successful scaffolds generated per model.

| PDB | RFdiffusion | EvoDiff-MSA (Max) | EvoDiff-MSA (Random) | ESM-MSA |
|---|---|---|---|---|
| 1BCF | 100 | 100 | 98 | 99 |
| 6E6R | 71 | 87 | 63 | 96 |
| 2KL8 | 88 | 11 | 31 | 42 |
| 6EXZ | 42 | 86 | 87 | 73 |
| 1YCR | 74 | 3 | 0 | 0 |
| 6VW1 | 69 | 4 | 3 | 4 |
| 4JHW | 0 | 0 | 0 | 0 |
| 5TPN | 61 | 0 | 0 | 0 |
| 4ZYP | 40 | 0 | 0 | 0 |
| 3IXT | 25 | | | |
Model Specifications
License: MIT
Last Updated: August 2025
Input Type: Text
Output Type: Text
Provider: Microsoft
Languages: 1 Language