Arabic Leaderboards: Introducing Arabic Instruction Following, Updating AraGen, and More 16 days ago • 15
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks Paper • 2504.15521 • Published 2 days ago • 50
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications Paper • 2303.15446 • Published Mar 27, 2023 • 1
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications Paper • 2206.10589 • Published Jun 21, 2022
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Paper • 2406.09418 • Published Jun 13, 2024
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model Paper • 2503.21782 • Published 27 days ago
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding Paper • 2504.13180 • Published 6 days ago • 16
ArTST - Arabic Text Speech Transformer Collection Open-source project for Arabic Speech Recognition and Generation • 14 items • Updated 4 days ago • 10
Language and Planning in Robotic Navigation: A Multilingual Evaluation of State-of-the-Art Models Paper • 2501.05478 • Published Jan 7