I wrote that post you link to, and I don’t think ambitious value learning is doomed at all—just that we can’t do it the way we traditionally attempt to.
I specifically mean ambitious value learning using IRL. The resulting algorithm will look quite different from IRL as it currently exists. (In particular, the assumption that humans are reinforcement learners is problematic.)
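To spell out the kind of assumption at issue: standard IRL typically bakes in a fixed model of the demonstrator, e.g. as Boltzmann-rational with respect to optimal Q-values for some unknown reward. Here is a minimal sketch of how that human model enters the demonstration likelihood; the toy chain MDP, parameter values, and function names are my own illustration, not anything from the thread, and this is just one common formalization of the human model rather than the only one.

```python
import numpy as np

# Toy 5-state chain MDP (hypothetical, purely for illustration):
# states 0..4, action 0 moves left, action 1 moves right, clipped at the ends.
N_STATES, GAMMA, BETA = 5, 0.9, 5.0

def q_values(reward, iters=200):
    """Optimal Q-values for a candidate reward vector, via value iteration."""
    Q = np.zeros((N_STATES, 2))
    for _ in range(iters):
        V = Q.max(axis=1)
        for s in range(N_STATES):
            for a, step in enumerate((-1, 1)):
                s2 = min(max(s + step, 0), N_STATES - 1)
                Q[s, a] = reward[s2] + GAMMA * V[s2]
    return Q

def demo_log_likelihood(reward, demos):
    """Log-likelihood of (state, action) demonstrations under the assumed
    human model: actions chosen with probability proportional to
    exp(BETA * Q*(s, a; reward)). This fixed rationality model is the
    assumption being criticized above."""
    Q = BETA * q_values(reward)
    m = Q.max(axis=1, keepdims=True)  # stable log-softmax
    log_policy = Q - (m + np.log(np.exp(Q - m).sum(axis=1, keepdims=True)))
    return sum(log_policy[s, a] for s, a in demos)

# A demonstrator who always moves right, scored under one reward hypothesis.
demos = [(0, 1), (1, 1), (2, 1), (3, 1)]
print(demo_log_likelihood(np.array([0.0, 0.0, 0.0, 0.0, 1.0]), demos))
```

Everything the algorithm infers about values flows through that `demo_log_likelihood` model of the human; if the human is actually a learning agent rather than a noisily optimal planner, the inferred reward inherits the mismatch.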