Implement those changes, which often means modifying the self-attention layer, the order of the normalization layers, etc. Again, it is often useful to look at the similar architecture of already existing models in Transformers to get a better feeling for how your model should be implemented.
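As an illustration of the kind of architectural difference mentioned above, the sketch below contrasts post-layer-norm (the original Transformer ordering) with pre-layer-norm (used by e.g. GPT-2) around a self-attention sublayer. This is a minimal NumPy toy, not Transformers code: the attention here uses identity query/key/value projections purely to keep the ordering difference visible.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize the last dimension to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x):
    # Toy single-head self-attention with identity Q/K/V projections
    # (a stand-in for the real attention sublayer).
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def post_ln_block(x):
    # Post-LN (original Transformer): residual add, THEN normalize.
    return layer_norm(x + self_attention(x))

def pre_ln_block(x):
    # Pre-LN (e.g. GPT-2): normalize first, sublayer, then residual add.
    return x + self_attention(layer_norm(x))

x = np.random.default_rng(0).normal(size=(4, 8))
out_post = post_ln_block(x)
out_pre = pre_ln_block(x)
```

Both variants use the same sublayer, yet their outputs differ because of where the normalization sits; when porting a model, this ordering is exactly the kind of detail to verify against a similar existing implementation.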