- Use the memory layout (self.num_heads, 3, self.head_dim) instead of (3, self.num_heads, self.head_dim) for the QKV tensor with MHA.
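To make the difference between the two layouts concrete, the sketch below computes flat row-major offsets for each shape and reports the range of memory touched by one head's Q, K, and V. It is an illustration only: the helper names (`offset_heads_first`, `offset_qkv_first`, `head_span`) and the sizes `num_heads = 4`, `head_dim = 8` are assumptions made for this example and are not taken from any library.

```python
# Illustrative sketch: helper names and sizes are hypothetical, not library API.

def offset_heads_first(h, j, d, num_heads, head_dim):
    """Flat row-major offset for shape (num_heads, 3, head_dim)."""
    return (h * 3 + j) * head_dim + d


def offset_qkv_first(h, j, d, num_heads, head_dim):
    """Flat row-major offset for shape (3, num_heads, head_dim)."""
    return (j * num_heads + h) * head_dim + d


def head_span(offset_fn, h, num_heads, head_dim):
    """Smallest and largest flat offsets touched by head h's Q, K and V."""
    offsets = [
        offset_fn(h, j, d, num_heads, head_dim)
        for j in range(3)          # j = 0, 1, 2 selects Q, K, V
        for d in range(head_dim)   # position inside the head
    ]
    return min(offsets), max(offsets)


num_heads, head_dim = 4, 8  # assumed sizes for the example

# (num_heads, 3, head_dim): head 1 occupies one block of 3 * head_dim elements.
print(head_span(offset_heads_first, 1, num_heads, head_dim))

# (3, num_heads, head_dim): head 1's Q, K, V are num_heads * head_dim apart.
print(head_span(offset_qkv_first, 1, num_heads, head_dim))
```

With the `(num_heads, 3, head_dim)` layout, each head's Q, K, and V sit in one contiguous block of `3 * head_dim` elements. With `(3, num_heads, head_dim)`, the same head's Q, K, and V are separated by a stride of `num_heads * head_dim` elements.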