The authors of Reformer: The Efficient Transformer noticed that since the
computation is independent of the sequence_length dimension, it is mathematically equivalent to compute the output
embeddings of both feed forward layers [batch_size, config.hidden_size]_0, ..., [batch_size, config.hidden_size]_n
individually and concatenate them afterward to [batch_size, sequence_length, config.hidden_size] with n = sequence_length. This trades increased computation time for reduced memory use, but yields a mathematically
equivalent result.
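
To illustrate the equivalence, here is a minimal PyTorch sketch (not the library's implementation): it applies a feed forward block to each position chunk of shape [batch_size, hidden_size] separately, concatenates the results along the sequence dimension, and checks that the output matches the unchunked computation. The tensor shapes and the feed_forward module are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sizes; any values would work for this check.
batch_size, sequence_length, hidden_size = 2, 8, 16

# A generic transformer-style feed forward block (assumed for this sketch).
feed_forward = nn.Sequential(
    nn.Linear(hidden_size, 4 * hidden_size),
    nn.GELU(),
    nn.Linear(4 * hidden_size, hidden_size),
)

hidden_states = torch.randn(batch_size, sequence_length, hidden_size)

# Full computation: one pass over [batch_size, sequence_length, hidden_size].
full_output = feed_forward(hidden_states)

# Chunked computation: process each position ([batch_size, hidden_size]) on its
# own, then concatenate along the sequence_length dimension.
chunks = [feed_forward(hidden_states[:, i, :]) for i in range(sequence_length)]
chunked_output = torch.stack(chunks, dim=1)

# Same result, but the intermediate 4 * hidden_size activation only ever exists
# for one position at a time, reducing peak memory at the cost of extra passes.
assert torch.allclose(full_output, chunked_output, atol=1e-6)
```

The sketch chunks one position at a time for clarity; in practice the chunk size can be chosen larger to balance the memory savings against the overhead of the extra forward passes.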