Suppose the AI is addressing a letter containing $1,000,000. It can address it either to Jane Brown or to John Smith. Once the letter is addressed, the AI will be turned off and the letter will be posted.
A utility function uB that values Jane Brown wants the letter addressed to her, and likewise a utility function uS that values John Smith wants it addressed to him. These two utilities differ only on the action the AI takes, not on any subsequent observations. Therefore the claim "This implies that by choosing a, the agent expects to observe some uA-high scoring oA with greater probability than if it had selected ∅" is false: the agent need not expect to observe anything at all.
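One way to make this concrete (my own notation, not from the original post): write each utility as a function of the action a and the subsequent observation o,

$$
u_B(a, o) = \begin{cases} 1 & \text{if } a = \text{``address to Jane Brown''} \\ 0 & \text{otherwise,} \end{cases}
\qquad
u_S(a, o) = \begin{cases} 1 & \text{if } a = \text{``address to John Smith''} \\ 0 & \text{otherwise.} \end{cases}
$$

Both are constant in o, so an agent maximizing either one gains nothing from expecting any particular observation after acting; the choice of a alone settles the score.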
However, the theorem still holds, because we only need to consider utilities that differ on actions, such as uB and uS.