Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 149
Recurrent Models Collection These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space. • 15 items • Updated May 21 • 10
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated 12 days ago • 218
Not All Correct Answers Are Equal: Why Your Distillation Source Matters Paper • 2505.14464 • Published May 20 • 9
view article Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 635
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2 • 53
Skywork-Reward-V2 Collection Scaling preference data curation to the extreme • 9 items • Updated Jul 4 • 23
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 73
Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity Paper • 2505.21411 • Published May 27 • 17
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published Apr 15 • 18
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 40
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning Paper • 2506.08989 • Published Jun 10 • 15