The authors introduced Persimmon-8B, a decoder model based on the classic transformer architecture, with query and key normalization.
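To make "query and key normalization" concrete, here is a minimal sketch of single-head attention in which each query and key vector is normalized before the dot product. It is an illustration, not Persimmon-8B's actual implementation: it assumes the normalization is a layer norm over the head dimension, and it omits learned scale and bias, positional embeddings, and causal masking. The function names (`layer_norm`, `qk_norm_attention`) are hypothetical.

```python
import math


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/bias)."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def qk_norm_attention(queries, keys, values):
    """Single-head attention with normalized queries and keys.

    Assumption: layer norm is applied per vector before the dot product.
    Causal masking is left out to keep the sketch short.
    """
    dim = len(queries[0])
    q_normed = [layer_norm(q) for q in queries]
    k_normed = [layer_norm(k) for k in keys]
    outputs = []
    for q in q_normed:
        # Scaled dot-product scores against every normalized key.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in k_normed
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
        )
    return outputs


if __name__ == "__main__":
    q = [[1.0, 2.0, 4.0], [0.5, -1.0, 3.0]]
    k = [[2.0, 0.0, 1.0], [-1.0, 3.0, 0.5]]
    v = [[1.0, 0.0], [0.0, 1.0]]
    print(qk_norm_attention(q, k, v))
```

Because the queries and keys are normalized first, rescaling either input by a constant leaves the attention weights almost unchanged, which keeps the attention logits in a bounded range regardless of activation magnitude.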