t5-base
Version: 18
Last updated April 2025
The developers of the Text-To-Text Transfer Transformer (T5) write:
With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task.
T5-Base is the checkpoint with 220 million parameters.
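Because every task is cast as text generation, inputs are distinguished only by a short task prefix prepended to the source string. A minimal sketch of this convention (the prefix strings follow the ones used in the T5 paper; the helper function name is ours for illustration):

```python
def to_t5_input(task_prefix: str, text: str) -> str:
    """Prepend a T5 task prefix so the model knows which task to perform."""
    return f"{task_prefix}: {text}"

# The same model, loss, and hyperparameters handle different tasks;
# only the input string changes.
examples = [
    to_t5_input("translate English to German", "Berlin is the capital of Germany"),
    to_t5_input("summarize", "T5 reframes every NLP problem as text-to-text."),
    to_t5_input("cola sentence", "The book was read by me yesterday."),
]

for example in examples:
    print(example)
```

The prefixed strings are then tokenized and fed to the model exactly like any other text input.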

Training Details

Training Data

The model is pre-trained on the Colossal Clean Crawled Corpus (C4), which was developed and released in the same research paper as T5. The model was pre-trained on a multi-task mixture of unsupervised and supervised tasks. The following datasets were used for each objective:

Datasets used for the unsupervised denoising objective:

Datasets used for the supervised text-to-text language modeling objective:
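The unsupervised denoising objective corrupts contiguous spans of the input and trains the model to reconstruct them, marking each dropped span with a sentinel token. A toy illustration of this span-corruption format (the helper function and the chosen spans are ours for illustration; the `<extra_id_N>` sentinel names follow T5's vocabulary, and the sentence is the example used in the T5 paper):

```python
def corrupt_spans(tokens, spans):
    """Replace each (start, end) token span with a sentinel.

    Returns (corrupted_input, target): the model sees the corrupted
    input and must generate the dropped spans, delimited by the same
    sentinel tokens and closed by a final sentinel.
    """
    corrupted, target = [], []
    prev_end = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[prev_end:start])
        corrupted.append(sentinel)
        target.append(sentinel)
        target.extend(tokens[start:end])
        prev_end = end
    corrupted.extend(tokens[prev_end:])
    target.append(f"<extra_id_{len(spans)}>")
    return " ".join(corrupted), " ".join(target)

tokens = "Thank you for inviting me to your party last week".split()
inp, tgt = corrupt_spans(tokens, [(1, 2), (5, 7)])
# inp: "Thank <extra_id_0> for inviting me <extra_id_1> party last week"
# tgt: "<extra_id_0> you <extra_id_1> to your <extra_id_2>"
```

In real pre-training the spans are sampled randomly over subword tokens rather than chosen by hand.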

Training Procedure

In their abstract, the model developers write:
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks.
The T5 framework introduced in the paper involves a training procedure that brings together the approaches studied there. See the research paper for further details.

Evaluation Results

For full results for T5-Base, see the research paper, Table 14.

Testing Data, Factors & Metrics

The developers evaluated the model on 24 tasks; see the research paper for full details.

Model Evaluation samples

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Translation | Translation | wmt16/ro-en | evaluate-model-translation.ipynb | evaluate-model-translation.yml |

Inference samples

| Inference type | Python sample (Notebook) |
| --- | --- |
| Real time | sdk-example.ipynb |
| Real time | text-translation-online-endpoint.ipynb |

Sample inputs and outputs

Sample input

{
    "input_data": [
        "translate English to French: Life is so beautiful, once you learn how to live with it",
        "translate English to German: Berlin is the capital of Germany"
    ]
}

Sample output

[
  "La vie est si belle, une fois que vous apprenez à la vivre",
  "Berlin ist die Hauptstadt Deutschlands"
]
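A request body like the sample input above can be assembled in plain Python before being sent to a deployed online endpoint. A hedged sketch (the scoring URI and API key below are placeholders, not real values; the commented-out `requests.post` call shows one common way to invoke such an endpoint):

```python
import json

# Placeholder endpoint details -- replace with your own deployment's values.
SCORING_URI = "https://<endpoint-name>.<region>.inference.ml.azure.com/score"
API_KEY = "<api-key>"

payload = {
    "input_data": [
        "translate English to French: Life is so beautiful, once you learn how to live with it",
        "translate English to German: Berlin is the capital of Germany",
    ]
}

body = json.dumps(payload)

# Uncomment to send the request against a live deployment:
# import requests
# headers = {"Authorization": f"Bearer {API_KEY}",
#            "Content-Type": "application/json"}
# response = requests.post(SCORING_URI, data=body, headers=headers)
# translations = response.json()  # expected: a JSON list of translated strings
```

The response, if the deployment follows the sample output above, is a JSON list of translated strings in the same order as the inputs.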
Model Specifications
License: Apache-2.0
Last Updated: April 2025
Publisher:
Languages: 4 languages