L1 Collection • L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • 2 items • Updated Mar 7