Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
Rename reorder_and_upcast_attn->attention_softmax_in_fp32
- Cache the attention mask value to avoid recreating it every time.