Just now saw this very thoughtful review. I share a lot of your perspective, especially:
I think there are people who think that reward is the optimization target by definition or by design, as opposed to this being a highly non-trivial claim that needs to be argued for. It’s the former view that this post (correctly) argues against.
and
Looking back at the post, I felt some amount of “why are you meandering around instead of just saying the Thing?”, with the immediate next thought being “well, it’s hard to say the Thing”. Indeed, I do not know how to say it better.