Using our proposed efficient additive attention, we build a series of models called "SwiftFormer" that achieve state-of-the-art performance in terms of both accuracy and mobile inference speed.