The decoder updates these embeddings through multiple self-attention and encoder-decoder attention layers
to output decoder_hidden_states of the same shape: (batch_size, num_queries, d_model).
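
To make the shape bookkeeping concrete, below is a minimal sketch using `torch.nn.TransformerDecoder` in place of the model's actual decoder implementation; the dimensions (`num_queries=100`, `d_model=256`, six layers, 850 encoder tokens) are illustrative assumptions, not values taken from this document.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the real values come from the model configuration.
batch_size, num_queries, d_model = 2, 100, 256
num_encoder_tokens = 850  # e.g. flattened image feature positions (assumed)

# Query embeddings fed to the decoder: (batch_size, num_queries, d_model)
queries = torch.zeros(batch_size, num_queries, d_model)
# Encoder output that the decoder cross-attends to
encoder_hidden_states = torch.randn(batch_size, num_encoder_tokens, d_model)

# Each decoder layer applies self-attention over the queries, followed by
# encoder-decoder (cross-) attention over the encoder output.
decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)

decoder_hidden_states = decoder(tgt=queries, memory=encoder_hidden_states)
print(decoder_hidden_states.shape)  # torch.Size([2, 100, 256]), same shape as the queries
```

Note that the decoder preserves the query dimensions: however many attention layers are stacked, the output keeps the shape (batch_size, num_queries, d_model).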