As always, thanks to everyone involved in the newsletter!
The Understanding Learned Reward Functions paper looks great, both for studying inner alignment (the version with goal-directed/RL policies instead of mesa-optimizers) and for thinking about goal-directedness.