by facebook
BART Large is a sequence-to-sequence language model with an encoder-decoder architecture that pairs a bidirectional encoder (like BERT) with an autoregressive decoder (like GPT). It is pre-trained on English text to reconstruct documents from corrupted versions, which suits it to text generation tasks such as summarization, translation, and paraphrasing once fine-tuned. After fine-tuning, it also performs well on comprehension tasks such as text classification and question answering. The raw checkpoint supports generic uses such as text infilling, while versions fine-tuned on specific datasets cover specialized applications.
BART model pre-trained on English-language text. It was introduced in the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Lewis et al. and first released in this repository.
Disclaimer: The team releasing BART did not write a model card for this model, so this model card was written by the Hugging Face team.
BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
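The corrupt-then-reconstruct setup can be illustrated with a toy noising function. The sketch below is not the authors' implementation: the helper names `corrupt` and `make_training_pair`, whitespace tokenization, and the uniform span-length choice are all assumptions made for illustration (the paper's text-infilling scheme samples span lengths from a Poisson distribution instead). It only shows how a (corrupted input, original target) pair might be built.

```python
import random

MASK = "<mask>"  # mask token string used by the BART tokenizer


def corrupt(tokens, mask_prob=0.15, max_span=3, seed=0):
    """Toy noising function: replace random spans of tokens with one MASK each.

    Hypothetical helper for illustration only. Span lengths are drawn
    uniformly from 1..max_span, which is a simplification.
    """
    rng = random.Random(seed)
    corrupted = []
    i = 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            span = rng.randint(1, max_span)  # tokens hidden behind this mask
            corrupted.append(MASK)
            i += span
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted


def make_training_pair(text, **kwargs):
    """Return (corrupted_source, original_target) for denoising training."""
    tokens = text.split()  # whitespace split stands in for a real tokenizer
    return " ".join(corrupt(tokens, **kwargs)), text


if __name__ == "__main__":
    source, target = make_training_pair(
        "the quick brown fox jumps over the lazy dog", mask_prob=0.3, seed=1
    )
    print("encoder input :", source)
    print("decoder target:", target)
```

Because one mask can stand for several tokens, the model has to infer how much text is missing as well as what it says, which is what separates span infilling from single-token masking.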
BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering).
You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the model hub for versions fine-tuned on a task that interests you.
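Text infilling with the raw checkpoint might look like the sketch below. It wraps the mask-filling pattern from the Transformers documentation in a hypothetical helper, `fill_mask`, which is not part of the library. The `forced_bos_token_id=0` setting follows that documentation example for `facebook/bart-large`, and the early input check is an addition for illustration.

```python
def fill_mask(text, model_name="facebook/bart-large"):
    """Regenerate `text` with BART, filling in each <mask> span.

    Hypothetical convenience wrapper; the checkpoint is downloaded on first use.
    """
    if "<mask>" not in text:
        raise ValueError("text must contain at least one <mask> token")

    # Imported here so the input check above fails fast, before the heavy import.
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(model_name)
    # forced_bos_token_id=0 mirrors the mask-filling example in the Transformers docs.
    model = BartForConditionalGeneration.from_pretrained(
        model_name, forced_bos_token_id=0
    )

    # Encode the masked sentence and let the decoder regenerate the full text.
    batch = tokenizer(text, return_tensors="pt")
    generated_ids = model.generate(batch["input_ids"])
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling `fill_mask("UN Chief Says There Is No <mask> in Syria")` returns the whole sentence with the masked span filled in, not just the missing words, because the decoder regenerates the full sequence.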
Here is how to use this model in PyTorch:
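from transformers import BartTokenizer, BartModel
# Load the pre-trained tokenizer and the base model (no task-specific head)
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartModel.from_pretrained('facebook/bart-large')
# Tokenize a sentence into PyTorch tensors and run a forward pass
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
# Final-layer decoder hidden states, shape (batch, sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state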
@article{DBLP:journals/corr/abs-1910-13461,
  author     = {Mike Lewis and
                Yinhan Liu and
                Naman Goyal and
                Marjan Ghazvininejad and
                Abdelrahman Mohamed and
                Omer Levy and
                Veselin Stoyanov and
                Luke Zettlemoyer},
  title      = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
                Generation, Translation, and Comprehension},
  journal    = {CoRR},
  volume     = {abs/1910.13461},
  year       = {2019},
  url        = {http://arxiv.org/abs/1910.13461},
  eprinttype = {arXiv},
  eprint     = {1910.13461},
  timestamp  = {Thu, 31 Oct 2019 14:02:26 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}