Implement those changes, which often means modifying the self-attention layer, the order of the normalization layers, and so on. Again, it is often useful to look at the similar architecture of already existing models in Transformers to get a better feeling for how your model should be implemented.
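To make the "order of the normalization layers" point concrete, here is a minimal PyTorch sketch (not Transformers API; the class and argument names are illustrative) of a block whose normalization order is configurable: pre-norm applies LayerNorm before attention (as in GPT-2), while post-norm applies it after the residual addition (as in the original BERT). Porting a model often comes down to getting exactly this kind of ordering right.

```python
import torch
import torch.nn as nn


class TransformerBlock(nn.Module):
    """Illustrative attention block with configurable normalization order.

    norm_order="pre":  x + Attn(LayerNorm(x))        (GPT-2 style)
    norm_order="post": LayerNorm(x + Attn(x))        (original BERT style)
    """

    def __init__(self, hidden_size: int, num_heads: int, norm_order: str = "pre"):
        super().__init__()
        assert norm_order in ("pre", "post")
        self.norm_order = norm_order
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.norm_order == "pre":
            # Pre-norm: normalize the input, then add the residual.
            h = self.norm(x)
            attn_out, _ = self.attn(h, h, h)
            return x + attn_out
        # Post-norm: add the residual first, then normalize the sum.
        attn_out, _ = self.attn(x, x, x)
        return self.norm(x + attn_out)


# Both orderings preserve the (batch, seq_len, hidden) shape.
x = torch.randn(2, 5, 32)
for order in ("pre", "post"):
    block = TransformerBlock(hidden_size=32, num_heads=4, norm_order=order)
    print(order, tuple(block(x).shape))
```

Comparing such a sketch against the `forward` pass of an existing, similar model in the library quickly reveals which of the two orderings (or which attention variant) your model actually uses.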