According to the abstract, BART uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
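The encoder/decoder distinction above comes down to the attention mask each side uses: the bidirectional encoder lets every position attend to the whole input, while the left-to-right decoder restricts each position to earlier ones. A minimal NumPy sketch (the function names are illustrative, not part of any library API):

```python
import numpy as np

def encoder_attention_mask(seq_len: int) -> np.ndarray:
    # Bidirectional (BERT-like): every position may attend to every position.
    return np.ones((seq_len, seq_len), dtype=bool)

def decoder_attention_mask(seq_len: int) -> np.ndarray:
    # Left-to-right / causal (GPT-like): position i attends only to
    # positions j <= i, so future tokens are masked out.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

# For a 4-token sequence, the encoder mask is all True, while the
# decoder mask is lower-triangular.
print(encoder_attention_mask(4).astype(int))
print(decoder_attention_mask(4).astype(int))
```

In BART, both masks are used in the same model: the encoder applies the full mask over the (possibly corrupted) input, and the decoder applies the causal mask while also cross-attending to the encoder's output.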