If your hardware is not compatible with Flash Attention 2, you can still benefit from attention kernel optimizations through the Better Transformer support covered above.
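As a minimal sketch of what this looks like in practice (assuming the `optimum` package is installed, which BetterTransformer relies on, and using `openai-community/gpt2` purely as an illustrative checkpoint):

```python
from transformers import AutoModelForCausalLM

# Load any supported model checkpoint (gpt2 is used here only as an example)
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

# Swap the attention/feed-forward modules for BetterTransformer's fused kernels
model = model.to_bettertransformer()

# ... run inference as usual ...

# Revert to the original Transformers modeling code, e.g. before saving or training
model = model.reverse_bettertransformer()
```

Note that `reverse_bettertransformer()` should be called before saving the model so that the checkpoint uses the canonical Transformers modeling code.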