t5-base

Version: 18
The developers of the Text-To-Text Transfer Transformer (T5) write:
With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task.
T5-Base is the checkpoint with 220 million parameters.
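To make the text-to-text idea concrete, here is a minimal sketch of how different tasks can be cast as plain input strings by prepending a task prefix. The prefixes follow the examples given in the T5 paper; the helper name `to_text_to_text` and the dictionary `TASK_PREFIXES` are hypothetical and are not part of any library.

```python
# Hypothetical helper: casts different NLP tasks into plain text-to-text inputs.
# The prefixes follow examples from the T5 paper; all names are illustrative.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
    "cola": "cola sentence: ",
}


def to_text_to_text(task: str, text: str) -> str:
    """Return the input string a text-to-text model would receive."""
    if task not in TASK_PREFIXES:
        raise KeyError(f"unknown task: {task}")
    return TASK_PREFIXES[task] + text


examples = [
    ("translate_en_de", "That is good."),
    ("cola", "The course is jumping well."),
]
for task, text in examples:
    # Every task, whether translation or classification, becomes a string.
    print(to_text_to_text(task, text))
```

Because every task reduces to a string in and a string out, the same model and loss can serve all of them. With the Hugging Face `transformers` library, a string like this could be tokenized and passed to `T5ForConditionalGeneration.generate`; that step is left out here to keep the sketch free of dependencies.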

Training Data

The model is pre-trained on the Colossal Clean Crawled Corpus (C4), which was developed and released in the context of the same research paper as T5. The model was pre-trained on a multi-task mixture of unsupervised and supervised tasks.
The following datasets were used for each objective:

Datasets used for Unsupervised denoising objective:

Datasets used for Supervised text-to-text language modeling objective
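
As a rough illustration of the unsupervised denoising objective named above, the sketch below replaces chosen spans of a sentence with sentinel tokens and builds the matching target. It assumes the `<extra_id_N>` sentinel naming used by the Hugging Face T5 tokenizer. The function `span_corrupt` is hypothetical, and the spans are fixed by hand rather than sampled at random as in actual pre-training.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, end) span with a sentinel; return (input, target).

    `spans` must be sorted, non-overlapping, half-open index ranges.
    Sentinel names assume the <extra_id_N> convention; this is a sketch,
    not the pre-processing code used to train the model.
    """
    inp, tgt, cursor = [], [], 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[cursor:start])  # keep the text before the span
        inp.append(sentinel)              # mask the span in the input
        tgt.append(sentinel)              # target names the sentinel ...
        tgt.extend(tokens[start:end])     # ... followed by the dropped tokens
        cursor = end
    inp.extend(tokens[cursor:])
    tgt.append(f"<extra_id_{len(spans)}>")  # final sentinel closes the target
    return " ".join(inp), " ".join(tgt)


tokens = "Thank you for inviting me to your party last week .".split()
corrupted, target = span_corrupt(tokens, [(2, 4), (8, 9)])
print(corrupted)  # Thank you <extra_id_0> me to your party <extra_id_1> week .
print(target)     # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

The model is trained to produce the target string from the corrupted input, so this objective is itself a text-to-text task.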

Training Procedure

In their abstract, the model developers write:
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks.
The framework introduced, the T5 framework, involves a training procedure that brings together the approaches studied in the paper. See the research paper for further details.
For full results for T5-Base, see the research paper, Table 14.

Testing Data, Factors & Metrics

The developers evaluated the model on 24 tasks; see the research paper for full details.

Quick facts

Model provider
Type: Text translation
Lifecycle: Generally available (GA)