Suppose the AI is addressing a letter containing $1,000,000. It can address it either to Jane Brown or to John Smith. Once the letter is addressed, the AI will be turned off and the letter will be posted.
A utility function uB that values Jane Brown wants the letter addressed to her, and likewise a utility function uS that values John Smith wants it addressed to him. These two utilities differ only on the action the AI takes, not on any subsequent observations. Therefore the claim "This implies that by choosing a, the agent expects to observe some uA-high scoring oA with greater probability than if it had selected ∅" is false: the agent need not expect to observe anything at all.
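One way to make this concrete (my own notation, not from the original post): write each utility as a function of the action a and the subsequent observation o,

$$
u_B(a, o) = \begin{cases} 1 & \text{if } a = \text{``address to Jane Brown''} \\ 0 & \text{otherwise,} \end{cases}
\qquad
u_S(a, o) = \begin{cases} 1 & \text{if } a = \text{``address to John Smith''} \\ 0 & \text{otherwise.} \end{cases}
$$

Both are constant in o, so an agent maximizing either one gains nothing from expecting any particular observation after acting; the choice of a alone settles the score.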
However, the theorem still holds, because we only need to consider utilities that differ on actions, such as uB and uS.