|
|
|
MPNet |
|
Overview |
|
The MPNet model was proposed in MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu. |
|
MPNet adopts a novel pre-training method, named masked and permuted language modeling, to inherit the advantages of |
|
masked language modeling and permuted language modeling for natural language understanding. |
|
The abstract from the paper is the following: |
|
BERT adopts masked language modeling (MLM) for pre-training and is one of the most successful pre-training models. |
|
Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for |
|
pre-training to address this problem. However, XLNet does not leverage the full position information of a sentence and |
|
thus suffers from position discrepancy between pre-training and fine-tuning. In this paper, we propose MPNet, a novel |
|
pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. MPNet leverages the |
|
dependency among predicted tokens through permuted language modeling (vs. MLM in BERT), and takes auxiliary position |
|
information as input to make the model see a full sentence and thus reducing the position discrepancy (vs. PLM in |
|
XLNet). We pre-train MPNet on a large-scale dataset (over 160GB text corpora) and fine-tune on a variety of |
|
down-streaming tasks (GLUE, SQuAD, etc). Experimental results show that MPNet outperforms MLM and PLM by a large |
|
margin, and achieves better results on these tasks compared with previous state-of-the-art pre-trained methods (e.g., |
|
BERT, XLNet, RoBERTa) under the same model setting. |
|
The original code can be found here. |
|
Usage tips |
|
MPNet doesn't have token_type_ids, you don't need to indicate which token belongs to which segment. Just |
|
separate your segments with the separation token tokenizer.sep_token (or [sep]). |
|
Resources |
|
|
|
Text classification task guide |
|
Token classification task guide |
|
Question answering task guide |
|
Masked language modeling task guide |
|
Multiple choice task guide |
|
|
|
MPNetConfig |
|
[[autodoc]] MPNetConfig |
|
MPNetTokenizer |
|
[[autodoc]] MPNetTokenizer |
|
- build_inputs_with_special_tokens |
|
- get_special_tokens_mask |
|
- create_token_type_ids_from_sequences |
|
- save_vocabulary |
|
MPNetTokenizerFast |
|
[[autodoc]] MPNetTokenizerFast |
|
|
|
MPNetModel |
|
[[autodoc]] MPNetModel |
|
- forward |
|
MPNetForMaskedLM |
|
[[autodoc]] MPNetForMaskedLM |
|
- forward |
|
MPNetForSequenceClassification |
|
[[autodoc]] MPNetForSequenceClassification |
|
- forward |
|
MPNetForMultipleChoice |
|
[[autodoc]] MPNetForMultipleChoice |
|
- forward |
|
MPNetForTokenClassification |
|
[[autodoc]] MPNetForTokenClassification |
|
- forward |
|
MPNetForQuestionAnswering |
|
[[autodoc]] MPNetForQuestionAnswering |
|
- forward |
|
|
|
TFMPNetModel |
|
[[autodoc]] TFMPNetModel |
|
- call |
|
TFMPNetForMaskedLM |
|
[[autodoc]] TFMPNetForMaskedLM |
|
- call |
|
TFMPNetForSequenceClassification |
|
[[autodoc]] TFMPNetForSequenceClassification |
|
- call |
|
TFMPNetForMultipleChoice |
|
[[autodoc]] TFMPNetForMultipleChoice |
|
- call |
|
TFMPNetForTokenClassification |
|
[[autodoc]] TFMPNetForTokenClassification |
|
- call |
|
TFMPNetForQuestionAnswering |
|
[[autodoc]] TFMPNetForQuestionAnswering |
|
- call |
|
|
|
|