Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 209
Article 🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It? By Kseniase • Mar 17 • 328
Article I trained a Language Model to schedule events with GRPO! By anakin87 • Apr 29 • 85
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 262
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 510
The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters • Running • 3.1k
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published Feb 11 • 57