Rethinking Reflection in Pre-Training • Collection • Datasets & Artifacts related to the paper "Rethinking Reflection in Pre-Training" • 9 items • Updated 10 days ago • 4
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training • Paper • 2503.18929 • Published 28 days ago • 3 • 3
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback • Paper • 2503.22230 • Published 24 days ago • 43