meta-llama/Llama-4-Scout-17B-16E-Instruct Image-Text-to-Text • Updated 15 days ago • 753k • 821
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 15 days ago • 622k • 1.32k
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 155
Running 2.51k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
NanoFlow: Towards Optimal Large Language Model Serving Throughput Paper • 2408.12757 • Published Aug 22, 2024 • 18
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8, 2024 • 163
Inference Performance Optimization for Large Language Models on CPUs Paper • 2407.07304 • Published Jul 10, 2024 • 54