Here are the variants which make no explicit mention of TDT anywhere in the problem statement. It seems a real strain to describe either of them as unfair to TDT. Yet TDT will be outperformed on them by CDT, unless it resolves never to allow itself to be outperformed on any problem (in TDT über alles fashion).
Problem 1: Omega (who experience has shown is always truthful) presents the usual two boxes A and B and announces the following. “Before you entered the room, I selected an agent at random from the following distribution over all full source-codes for decision theory agents (insert distribution). I then simulated the result of presenting this exact problem to that agent. I won’t tell you what the agent decided, but I will tell you that if the agent two-boxed then I put nothing in Box B, whereas if the agent one-boxed then I put big Value-B in Box B. Regardless of how the simulated agent decided, I put small Value-A in Box A. Now please choose your box or boxes.”
Problem 2: Our ever-reliable Omega now presents ten boxes, numbered from 1 to 10, and announces the following. “Exactly one of these boxes contains $1 million; the others contain nothing. You must take exactly one box to win the money; if you try to take more than one, then you won’t be allowed to keep any winnings. Before you entered the room, I ran multiple simulations of this problem as presented to different agents, sampled uniformly from different possible future universes according to their relative numbers, with the universes themselves sampled from my best projections of the future. I determined the box which the agents were least likely to take. If there were several such boxes tied for equal-lowest probability, then I just selected one of them, the one labelled with the smallest number. I then placed $1 million in the selected box. Please choose your box.”
unless it resolves never to allow itself to be outperformed on any problem (in TDT über alles fashion).
This is not actually possible. You can always play the “I simulated you and put the money in the place you don’t choose” game.
It seems a real strain to describe either of them as unfair to TDT.
From this side of the screen, this looks like a property of you, not the problems. If we replace the statement about “relative numbers” in the future (we were having to make assumptions about that anyhow, so let’s just save time and stick in the assumptions), then problem 2 reads “I simulated the best decision theory by definition X and put the money in the place it doesn’t choose.” This demonstrates that no matter how good a decision theory is by any definition, it can still get hosed by Omega. In this case we’re assuming that definition X is maximized by TDT (thus, it’s a unique specification), and yea, TDT did go forth and get hosed by Omega.
This is not actually possible. You can always play the “I simulated you and put the money in the place you don’t choose” game.
But the obvious response to that game is randomisation among the choice options: there is no guarantee of winning, but no-one else can do better than you either. It takes a new “twist” on the problem to defeat the randomisation approach, and show that another agent type can do better.
I did ask on my original post (on Problematic Problems) whether that “twist” had been proposed or studied before. There were no references, but if you have one, please let me know.
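To make the randomisation point concrete, here is a minimal sketch of one way to model the basic “I simulated you” game. It assumes Omega knows the agent's mixed strategy (but not the outcome of its coin flips) and hides the money in the box the agent is least likely to open. The function name `win_probability` and the example strategies are illustrative assumptions, not part of the original problem.

```python
from fractions import Fraction

def win_probability(strategy):
    """Chance of finding the money when Omega, knowing the agent's mixed
    strategy, hides it in the box the agent is least likely to open."""
    assert sum(strategy) == 1, "strategy must be a probability distribution"
    return min(strategy)

n = 10
# Always open box 1: Omega simply hides the money elsewhere.
deterministic = [Fraction(1)] + [Fraction(0)] * (n - 1)
# Open each box with equal probability.
uniform = [Fraction(1, n)] * n
# Favour box 1 slightly: every other box becomes a better hiding place.
skewed = [Fraction(19, 100)] + [Fraction(9, 100)] * (n - 1)

for name, s in [("deterministic", deterministic),
                ("uniform", uniform),
                ("skewed", skewed)]:
    print(name, win_probability(s))
```

Under this model no strategy beats the uniform one, since the smallest of n probabilities summing to 1 can never exceed 1/n. That is the sense in which randomisation offers no guarantee of winning but cannot be outdone in the basic game.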
It seems a real strain to describe either of them as unfair to TDT.
From this side of the screen, this looks like a property of you, not the problems. If we replace the statement about “relative numbers” in the future (we were having to make assumptions about that anyhow, so let’s just save time and stick in the assumptions), then problem 2 reads “I simulated the best decision theory by definition X and put the money in the place it doesn’t choose.” This demonstrates that no matter how good a decision theory is by any definition, it can still get hosed by Omega. In this case we’re assuming that definition X is maximized by TDT (thus, it’s a unique specification), and yea, TDT did go forth and get hosed by Omega.
So there’s a class of problems where failure is actually a good sign? Interesting. You might want to post further on that, actually.
Hm, yeah. After some computational work at least. Every decision procedure can get hosed by Omega, and the way in which it gets hosed is diagnostic of its properties. Though not uniquely, I guess, so you can’t say “it fails this special test therefore it is good.”
I think the clearest and simplest version of Problem 1 is where Omega chooses to simulate a CDT agent with probability 0.5 and a TDT agent with probability 0.5. Let’s say that Value-B is $1,000,000, as is traditional, and Value-A is $1,000. TDT will one-box for an expected value of $500,000 (as opposed to $1,000 if it two-boxes), and CDT will always two-box and receive an expected $501,000. Both TDT and CDT have an equal chance of playing against each other in this version, and an equal chance of playing against themselves, and yet CDT still outperforms. It seems TDT suffers for CDT’s irrationality, and CDT benefits from TDT’s rationality. Very troubling.
EDIT: (I will note, though, that a TDT agent still can’t do any better by two-boxing—only make CDT do worse).
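The arithmetic for this 50/50 version can be checked with a short sketch. It assumes that one-boxing means taking Box B only, that all simulated TDT agents act the same way as the TDT agent choosing, and that CDT always two-boxes. The names `expected_payoff` and `P_TDT` are hypothetical helpers introduced here. On that reading the one-boxing figure comes out at $500,000, since the one-boxer forgoes Value-A.

```python
VALUE_A = 1_000       # small prize, always in Box A
VALUE_B = 1_000_000   # big prize, in Box B only if the simulated agent one-boxed

def expected_payoff(chooser_one_boxes, p_sim_one_boxes):
    """Expected winnings given the chooser's action and the probability
    that the agent Omega simulated took one box."""
    box_b = p_sim_one_boxes * VALUE_B
    return box_b if chooser_one_boxes else box_b + VALUE_A

P_TDT = 0.5  # assumed chance that Omega simulated a TDT agent (otherwise CDT)

# If TDT one-boxes, Box B is full exactly when the simulated agent was TDT.
tdt_if_one_box = expected_payoff(True, p_sim_one_boxes=P_TDT)
cdt_vs_one_boxing_tdt = expected_payoff(False, p_sim_one_boxes=P_TDT)

# If TDT two-boxes, every simulated agent two-boxes and Box B is always empty.
tdt_if_two_box = expected_payoff(False, p_sim_one_boxes=0.0)
cdt_vs_two_boxing_tdt = expected_payoff(False, p_sim_one_boxes=0.0)

print("TDT one-boxes:", tdt_if_one_box, "| CDT then gets:", cdt_vs_one_boxing_tdt)
print("TDT two-boxes:", tdt_if_two_box, "| CDT then gets:", cdt_vs_two_boxing_tdt)
```

The second line of output illustrates the point of the EDIT: switching to two-boxing does not help TDT, it only drags CDT down to the same $1,000.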
Assuming that the space of possible agents is large enough: for each individual version of a TDT agent, the best choice is to two-box. The advantage of TDT is the possibility of improving the expected value for a whole range of agents (including cooperation with other TDT agents in the prisoner’s dilemma). CDT agents happen to profit from that, and they profit even more than TDT agents. Does TDT maximize the expected value for the whole distribution of agents? In that case, it is still optimal in that respect.
Problem 2 is sensitive to arbitrarily small changes: assume that the space of TDT agents takes one box with probability 10%+epsilon and some other with 10%-epsilon. While the expected value is the same within O(epsilon), the money is now in some other box, and CDT would have to calculate that with the same precision. Apart from the experimental issue, I think this gives some theoretical challenges as well.
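The sensitivity can be illustrated with a small sketch of Omega's placement rule from Problem 2 (least likely box, ties broken by smallest number). The helper names `least_likely_box` and `perturbed`, the choice of boxes 3 and 7, and the size of epsilon are all assumptions made for the example. Exact fractions are used so that the tie in the unperturbed case is not disturbed by floating-point rounding.

```python
from fractions import Fraction

def least_likely_box(probs):
    """Number (1-based) of the box taken least often; ties go to the
    smallest box number because list.index returns the first match."""
    return probs.index(min(probs)) + 1

def perturbed(n, up, down, eps):
    """Uniform distribution over n boxes with eps of probability mass
    moved from box `down` to box `up`."""
    probs = [Fraction(1, n)] * n
    probs[up - 1] += eps
    probs[down - 1] -= eps
    return probs

eps = Fraction(1, 10**9)  # an arbitrarily small shift

before = least_likely_box(perturbed(10, up=3, down=7, eps=Fraction(0)))
after = least_likely_box(perturbed(10, up=3, down=7, eps=eps))
print("money without perturbation: box", before)
print("money with perturbation:    box", after)
```

In this example the money moves from box 1 to box 7 even though each probability changed by only one part in a billion, so an agent trying to predict the location would need to estimate the choice frequencies to at least that precision.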
I don’t have a reference for that “twist” either, so good job :D And yes, I was assuming that Omega was defeating randomization.