The agent could then manipulate whoever’s in charge of giving the “hand-of-god” optimal action.
I do think the “reducing uncertainty” framing captures something relevant, and TurnTrout’s outside-view post (huh, guess I can’t make links on mobile, so here: https://www.lesswrong.com/posts/BMj6uMuyBidrdZkiD/corrigibility-as-outside-view) grounds out that uncertainty as “how wrong am I about the true rewards of the many different people I could be helping?”