distilbert-base-uncased-distilled-squad
Version: 13
The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.
This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.
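To make the idea of knowledge distillation concrete, here is a minimal sketch of one common form of distillation loss: the KL divergence between temperature-softened teacher and student distributions. In extractive question answering, such distributions would typically be taken over candidate answer start (or end) positions. The function names, the temperature value, and the choice of KL divergence are illustrative assumptions; this card does not state the exact loss used to train this checkpoint.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Illustrative only: temperature=2.0 is an assumed value, and some
    formulations additionally scale the result by temperature ** 2.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0) if a probability underflows
    return sum(
        t * (math.log(t) - math.log(max(s, eps)))
        for t, s in zip(p_teacher, p_student)
        if t > 0
    )


# Hypothetical start-position logits over a four-token context.
teacher = [0.1, 4.0, 0.3, -1.0]
student = [0.0, 2.5, 1.0, -0.5]
print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what pushes the smaller model toward the teacher's behavior.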
Training Details
Training Data
The distilbert-base-uncased model card describes its training data as: "DistilBERT pretrained on the same data as BERT, which is BookCorpus, a dataset consisting of 11,038 unpublished books and English Wikipedia (excluding lists, tables and headers)."
To learn more about the SQuAD v1.1 dataset, see the SQuAD v1.1 data card.
Training Procedure
Preprocessing
See the distilbert-base-uncased model card for further details.
Pretraining
See the distilbert-base-uncased model card for further details.
Evaluation Results
As discussed in the model repository, this model reaches an F1 score of 86.9 on the SQuAD v1.1 dev set (for comparison, the BERT bert-base-uncased version reaches an F1 score of 88.5).
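For readers unfamiliar with the metric, the sketch below shows how a token-level F1 score between a predicted answer and a reference answer can be computed. It is modeled on the normalization steps of the SQuAD v1.1 evaluation script (lowercasing, removing punctuation and English articles, collapsing whitespace), but it is a simplified illustration and not the code used to produce the numbers above. A dataset-level score such as 86.9 is the per-question F1 (taking the best match among the reference answers) averaged over the dev set and expressed as a percentage.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction, ground_truth):
    """Token-overlap F1 between one prediction and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most min(count) times.
    common = Counter(pred_tokens) & Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(f1_score("John", "John"))        # exact match
print(f1_score("John Smith", "John"))  # partial overlap
```

Unlike exact match, F1 gives partial credit when the predicted span overlaps the reference without matching it exactly.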
Limitations and Biases
Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, and therefore using the model to generate such content is out of scope for the abilities of this model.
Model Evaluation samples
| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
|---|---|---|---|---|
| Question Answering | Extractive Q&A | SQuAD v2 | evaluate-model-question-answering.ipynb | evaluate-model-question-answering.yml |
Inference samples
| Inference type | Python sample (Notebook) |
|---|---|
| Real time | sdk-example.ipynb |
| Real time | question-answering-online-endpoint.ipynb |
Sample inputs and outputs
Sample input
{
  "input_data": {
    "question": "What's my name?",
    "context": "My name is John and I live in Seattle"
  }
}
Sample output
[
  "John"
]
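The sample input and output above can be produced and consumed with a few lines of Python. The sketch below only builds the JSON request body and validates a response of the shape shown; the helper names `build_request` and `parse_response` are hypothetical, and sending the request to a deployed endpoint (URL, authentication, client library) is deliberately left out because those details depend on the deployment.

```python
import json


def build_request(question, context):
    """Serialize a question/context pair into the sample input format."""
    payload = {"input_data": {"question": question, "context": context}}
    return json.dumps(payload)


def parse_response(body):
    """Parse a response body shaped like the sample output: a list of strings."""
    answers = json.loads(body)
    if not isinstance(answers, list) or not all(isinstance(a, str) for a in answers):
        raise ValueError("expected a JSON list of answer strings")
    return answers


request_body = build_request(
    "What's my name?", "My name is John and I live in Seattle"
)
print(request_body)

# Response body copied from the sample output above.
print(parse_response('["John"]'))
```

Because the model is extractive, each returned answer is expected to be a span copied from the supplied context rather than newly generated text.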
Model Specifications
License: Apache-2.0
Last Updated: April 2025
Provider
Languages: 1 Language