Furthermore, it allows each input token to
interact with all other tokens in the layer.
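
As a rough illustration of what "each token interacts with all other tokens" can look like, the sketch below assumes the mechanism being described is scaled dot-product self-attention; the surrounding text does not name it here, so treat that as an assumption. The function name `self_attention` and the toy vectors are hypothetical, and the learned query, key, and value projections of a real model are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Unprojected scaled dot-product self-attention (illustrative only).

    Assumption: queries, keys, and values are the token vectors themselves.
    Returns the mixed output vectors and the attention weight rows.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Score this token against every token in the layer, itself included.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        # The output is a weighted mix of all token vectors.
        out = [sum(w[j] * tokens[j][i] for j in range(len(tokens))) for i in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy input: three tokens with two-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Every row of `weights` has one strictly positive entry per token, which is the sense in which each token can draw on all the others within a single layer.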