Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
raw
history blame contribute delete
147 Bytes
The output from the decoder is passed to a language modeling head, which performs a linear transformation to convert the hidden states into logits.