Efficient Generative Model Training via Embedded Representation Warmup Paper • 2504.10188 • Published 10 days ago • 12
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers Paper • 2501.08303 • Published Jan 14
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance Paper • 2502.18772 • Published Feb 26 • 34
Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit? Paper • 2309.06891 • Published Sep 13, 2023 • 2