To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference.
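The retrieval step described above can be sketched at a high level: score every document in the corpus against the query (an inner product of dense embeddings is the usual choice) and keep the top-k to condition the language model on. The embedding vectors and function names below are illustrative assumptions, not the paper's actual implementation.

```python
def retrieve_top_k(query_emb, doc_embs, k=2):
    # Relevance score: inner product between the query embedding and each
    # document embedding. In a learned retriever these embeddings come from
    # trained encoders; here they are toy vectors for illustration.
    scores = [sum(q * d for q, d in zip(query_emb, doc)) for doc in doc_embs]
    # Rank documents by descending score and keep the k best.
    ranked = sorted(range(len(doc_embs)), key=lambda i: -scores[i])
    return ranked[:k]

# Toy corpus: four "document" embeddings in a 3-dimensional space.
doc_embs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_emb = [1.0, 0.0, 0.0]

top = retrieve_top_k(query_emb, doc_embs, k=2)
print(top)  # → [0, 2]: the two documents most aligned with the query
```

In the full model, the retrieved documents would then be attended over jointly with the query, and the retriever's scores would receive gradient signal through the language-modeling objective; this sketch covers only the scoring-and-selection step.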