NaiveTortoise comments on New safety research agenda: scalable agent alignment via reward modeling

NaiveTortoise 1 Jan 2019 0:27 UTC
LW: 6 AF: 3
Thanks a lot! This definitely clears things up and also highlights the difference between recursive reward modeling and typical amplification/the expert imitation approach you mentioned.