If there is more than one utility function that an agent could end up maximizing, then it is not an expected utility maximizer, because any particular utility function is better maximized by maximizing it directly than by possibly maximizing some other utility function, depending on the circumstances. As an example, suppose you could end up using one of two utility functions, u and v; there are three possible outcomes, X, Y, and Z; and u(X)>u(Y) while v(X)<v(Y). Consider two possible circumstances:
1) You get to choose between X and Y.
2) You get to choose between the lotteries .5X+.5Z and .5Y+.5Z.
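If you would end up using u if (1) happens but end up using v if (2) happens, then you pick X over Y in (1) but .5Y+.5Z over .5X+.5Z in (2). Mixing both options with the same chance of Z has reversed your preference, so you violate the independence axiom.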
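To make the violation concrete, here is a rough Python sketch. The numeric utilities are made up for illustration; the only things taken from the example above are the orderings u(X)>u(Y) and v(X)<v(Y). The helper names `expected_utility` and `choose` are hypothetical too.

```python
# Hypothetical utility values: only the orderings u(X) > u(Y) and
# v(X) < v(Y) come from the example; the numbers are assumptions.
u = {"X": 1.0, "Y": 0.0, "Z": 0.5}
v = {"X": 0.0, "Y": 1.0, "Z": 0.5}


def expected_utility(utility, lottery):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())


def choose(utility, options):
    """Return the label of the option with the highest expected utility."""
    return max(options, key=lambda label: expected_utility(utility, options[label]))


# Circumstance (1): X versus Y outright.
case1 = {"X": {"X": 1.0}, "Y": {"Y": 1.0}}
# Circumstance (2): the same two options, each mixed 50/50 with Z.
# Labels record which of X or Y is being mixed with Z.
case2 = {"X": {"X": 0.5, "Z": 0.5}, "Y": {"Y": 0.5, "Z": 0.5}}

pick1 = choose(u, case1)  # the agent happens to use u in circumstance (1)
pick2 = choose(v, case2)  # the agent happens to use v in circumstance (2)

# Independence requires the same ranking before and after mixing with Z.
violates_independence = pick1 != pick2

# For comparison, an agent that sticks with u ranks both cases the same way.
consistent_pick2 = choose(u, case2)
```

With these assumed numbers the switching agent prefers X in (1) and the Y-mixture in (2), while an agent that keeps u throughout prefers the X side both times.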
Relate this to value loading. If the programmer says cake, you value cake; if they say death, you value death. You could see this as choosing between two utilities, or you could see it as having a single utility function where “what the programmer says” strongly distinguishes between otherwise identical universes.
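The two readings can be put side by side in a small Python sketch. Everything here is a toy assumption: the outcomes are just the strings "cake" and "death", and the function names are invented for illustration.

```python
def u_cake(outcome):
    """Toy utility that values only cake."""
    return 1.0 if outcome == "cake" else 0.0


def u_death(outcome):
    """Toy utility that values only death."""
    return 1.0 if outcome == "death" else 0.0


def two_utilities_view(programmer_says, outcome):
    """Reading 1: the programmer's word selects which utility you use."""
    chosen = u_cake if programmer_says == "cake" else u_death
    return chosen(outcome)


def single_utility_view(world):
    """Reading 2: one fixed utility over whole worlds, where a world
    includes what the programmer said as well as the outcome."""
    programmer_says, outcome = world
    return 1.0 if outcome == programmer_says else 0.0


# Enumerate every toy world and check that the two readings agree.
worlds = [(said, got) for said in ("cake", "death") for got in ("cake", "death")]
views_agree = all(
    two_utilities_view(said, got) == single_utility_view((said, got))
    for said, got in worlds
)
```

In this toy setup both readings assign the same number to every world, so the difference is in the description: under the second reading the utility function never changes, and "what the programmer says" is just another feature of the world it scores.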
Here’s a better proof of the existence of unlosing agents: http://lesswrong.com/r/discussion/lw/knv/model_of_unlosing_agents/