As the Perceiver's input
length does not affect the computation time of the self-attention layers (which operate on a fixed-size latent array),
one can provide raw bytes directly, feeding inputs of length 2048 to the model.
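To illustrate, byte-level inputs can be prepared without any subword tokenizer: the text is simply encoded as UTF-8 byte ids and padded to the fixed length of 2048. The sketch below is a minimal illustration, not the library's actual tokenizer; the `PAD_ID` value and the `bytes_to_inputs` helper are assumptions for demonstration.

```python
# Minimal sketch of byte-level input preparation for a Perceiver-style
# language model. Because self-attention runs on a fixed-size latent
# array, the input length (padded here to 2048) only affects the cheaper
# cross-attention step.

MAX_LEN = 2048  # fixed input length used in the example above
PAD_ID = 0      # assumed padding id, for illustration only


def bytes_to_inputs(text: str, max_len: int = MAX_LEN) -> list:
    """Encode text as raw UTF-8 byte ids, then truncate/pad to max_len."""
    ids = list(text.encode("utf-8"))[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


inputs = bytes_to_inputs("Hello, Perceiver!")
print(len(inputs))  # 2048
```

In practice one would use the model's own byte-level tokenizer, which also adds special tokens; the point here is only that no vocabulary beyond the 256 byte values is required.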