Consider Problem 3: Omega presents you with two boxes, one of which contains $100, and says that it just ran a simulation of you in the present situation and put the money in the box the simulation didn’t choose.
This is a standard diagonal construction, where the environment is set up so that you are punished for the actions you choose and rewarded for the ones you don't choose, irrespective of what those actions are. This doesn't depend on the decision algorithm you're implementing. A possible escape strategy is to make yourself unpredictable to the environment. The difficulty would also go away if the thing being predicted weren't you, but something else you could predict as well (like a different agent that doesn't simulate you).
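To make the diagonalization concrete, here's a minimal Python sketch (every name in it, like `omega_payoff`, is a hypothetical illustration, not part of the original problem statement). The environment runs the agent once as the simulation and once for real; any agent that decides deterministically makes the same choice in both runs, so it always opens the empty box, no matter which algorithm it implements:

```python
def omega_payoff(agent):
    """Diagonal environment: simulate the agent, then put the $100
    in the box the simulation did NOT choose."""
    simulated_choice = agent()        # Omega's simulation of you
    money_box = 1 - simulated_choice  # the money goes in the other box
    actual_choice = agent()           # your real decision
    return 100 if actual_choice == money_box else 0

# A deterministic agent chooses identically in both runs,
# so it always opens the empty box, whatever its algorithm is.
always_box_0 = lambda: 0
assert omega_payoff(always_box_0) == 0
```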
The correct solution to this problem is to choose each box with equal probability; this problem is the reason why decision theories have to be non-deterministic. It comes up all the time in real life: I try to guess what safe combination you chose, try that combination, and if it works I take all your money. Or I try to guess what escape route you'll use and post all the guards there.
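To see why equal probabilities are the best you can do here, assume Omega's simulation draws its random bits independently of your real decision (if it can't be assumed, see below). If you pick a given box with probability p, you win exactly when your real pick differs from the simulated one, which happens with probability 2p(1-p); that peaks at 1/2 when p = 1/2, for an expected $50. A quick check of that arithmetic, restating the hypothetical environment from the sketch above:

```python
import random

def omega_payoff(agent):
    # Restated from the sketch above: the money goes in the box
    # the simulation did not choose.
    money_box = 1 - agent()                    # simulation run
    return 100 if agent() == money_box else 0  # real run

random_agent = lambda: random.randrange(2)     # each box with probability 1/2

trials = 100_000
average = sum(omega_payoff(random_agent) for _ in range(trials)) / trials
print(f"average payoff: ${average:.2f}")       # close to the theoretical $50
```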
What's interesting about Problem 2 is that it destabilizes what would otherwise be the normal game-theoretic strategy, by choosing deterministically where the probabilities are exactly equal.
Of course, you can just set up the thought experiment with the proviso that “be unpredictable” is not a possible move—in fact that’s the whole point of Omega in these sorts of problems. If Omega’s trying to break into your safe, he takes your money. In Nesov’s problem, if you can’t make yourself unpredictable, then you win nothing—it’s not even worth your time to open the box. In both cases, a TDT agent does strictly as well as it possibly could—the fact that there’s $100 somewhere in the vicinity doesn’t change that.
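One hedged way to model that proviso in the same toy setting (again, every name here is hypothetical): let Omega reproduce even your source of randomness, so the simulation and the real run see identical random bits. Every agent is then deterministic from Omega's point of view, the mixed strategy stops working, and the best available payoff really is $0:

```python
import random

def omega_payoff_sealed(agent_factory, seed):
    """Variant where 'be unpredictable' is not a possible move:
    Omega reproduces your random bits, so the simulation and the
    real run see exactly the same randomness."""
    simulation = agent_factory(random.Random(seed))
    money_box = 1 - simulation()     # the prediction covers your coin flips too
    real_agent = agent_factory(random.Random(seed))
    return 100 if real_agent() == money_box else 0

def make_random_agent(rng):
    return lambda: rng.randrange(2)

# Even the mixed strategy now loses every single time.
assert all(omega_payoff_sealed(make_random_agent, s) == 0 for s in range(1000))
print("the randomizing agent wins $0 in every run")
```

On this reading the TDT agent's $0 isn't a defect: once your random bits are part of what's predicted, no algorithm, deterministic or randomized, can expect more.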