TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published 6 days ago • 70
NVIDIA Nemotron Collection Open, Production-ready Enterprise Models. Nvidia Open Model license. • 3 items • Updated about 13 hours ago • 54
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 150
Recurrent Models Collection These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space. • 15 items • Updated May 21 • 10