S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published Feb 18 • 29 • 2