- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 54
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 57
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 47
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 25
Mei dianwen
mdw123
AI & ML interests
None yet
Recent Activity
updated a collection 29 days ago: Papers
updated a collection 29 days ago: Papers
upvoted a paper about 1 month ago: A Survey on Post-training of Large Language Models
Organizations
None yet
Collections
1
Models
None public yet
Datasets
None public yet