
John Schulman

Karma: 483

Scaling Laws for Reward Model Overoptimization

Oct 20, 2022, 12:20 AM
103 points
13 comments
1 min read
LW link
(arxiv.org)

Frequent arguments about alignment

John Schulman
Jun 23, 2021, 12:46 AM
103 points
17 comments
5 min read
LW link