Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers.
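A minimal sketch of what this bidirectional conditioning means in practice, assuming the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint: BERT's masked-language-model head predicts a hidden token from the words on *both* sides of it, unlike a left-to-right language model that only sees the preceding tokens.

```python
# Minimal sketch: BERT fills in a masked token using context from both
# directions. Assumes `pip install transformers` and internet access to
# download the `bert-base-uncased` checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Words before AND after [MASK] ("capital of France is ... .") inform
# the prediction, illustrating the joint left-and-right conditioning.
for candidate in fill_mask("The capital of France is [MASK]."):
    print(f"{candidate['token_str']!r}: {candidate['score']:.3f}")
```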