Reward Model - a rhd008 Collection

Reward Model

Updated Mar 31

Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13

LaMDA: Language Models for Dialog Applications

Paper • 2201.08239 • Published Jan 20, 2022 • 4