Deep reinforcement learning agents will not come to intrinsically and primarily value their reward signal; reward is not the trained agent’s optimization target.
I have no stake in this debate, but how is this particular point any different from what Eliezer says when he points out that humans do not optimize for inclusive genetic fitness (IGF)? I think the entire mesa-optimization concern is built on this premise, no?