Answering your questions 13 years later because I want to cite cousin_it’s list of open problems, and others may see your comment and wonder what the answers are. I’m not sure about 4 on his list but I think 1, 2, 3, and 5 are definitely still open problems.
How is this [1. 2TDT-1CDT] not resolved?
Consider the evolutionary version of this. Suppose there’s a group of TDT (or UDT or FDT) agents and one of them got a random mutation that changed it into a CDT agent, and this was known to everyone (but not the identity of the CDT agent). If two randomly selected agents paired off to play true PD against each other, a TDT agent would play C (since they’re still likely facing another TDT agent) and the CDT agent would play D. So the CDT agent would be better off, and it would want to be really careful not to become a TDT agent or delegate to a TDT AI or become accidentally correlated with TDT agents. This doesn’t necessarily mean that TDT/UDT/FDT is wrong, but seems like a weird outcome, plus how do we know that we’re not in a situation like the one that the CDT agent is in (i.e., should be very careful not to become/delegate to a TDT-like agent)?
Eliezer also ended up thinking this might be a real issue.
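To put rough numbers on the evolutionary version, here is a minimal sketch. The payoff values (T=5, R=3, P=1, S=0) and the function name `expected_payoffs` are my own illustrative assumptions, not part of the original problem statement. The code also takes it as given, as in the description above, that TDT agents play C and the CDT agent plays D; it does not derive those choices.

```python
from fractions import Fraction

# Hypothetical Prisoner's Dilemma payoffs with T > R > P > S (assumed values).
T, R, P, S = 5, 3, 1, 0


def expected_payoffs(n):
    """Expected per-game payoffs in a population of n agents with exactly one CDT mutant.

    Assumes (as in the scenario) that every TDT agent plays C and the CDT agent plays D.
    Returns (tdt_payoff, cdt_payoff) for an agent that has been selected to play.
    """
    # A TDT agent's opponent is the lone CDT agent with probability 1/(n-1).
    p_meet_cdt = Fraction(1, n - 1)
    tdt = (1 - p_meet_cdt) * R + p_meet_cdt * S
    # The CDT agent always meets a cooperating TDT agent and defects against it.
    cdt = Fraction(T)
    return tdt, cdt


tdt, cdt = expected_payoffs(101)
print(f"TDT: {float(tdt):.2f}  CDT: {float(cdt):.2f}")
```

Under these assumed payoffs the mutant does strictly better than any TDT agent for every population size, which is the "weird outcome" in question.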
This [2. Agent simulates predictor] basically says that the predictor is a rock, doesn’t depend on agent’s decision, which makes the agent lose because of the way problem statement argues into stipulating (outside of predictor’s own decision process) that this must be a two-boxing rock rather than a one-boxing rock.
No, we’re not saying the predictor is a rock. We’re assuming that the predictor is using some kind of reasoning process to make the prediction. Specifically, the predictor could be reasoning as follows: The agent is using UDT1.1 (for example). UDT1.1 is not updateless with regard to logical facts. Given enough computing power (which the agent has) it will inevitably simulate me and then update on my prediction, after which it will view two-boxing as having higher expected utility (no matter what my prediction actually is). Therefore I should predict that it will two-box.
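The predictor's argument can be sketched as a toy model. This is only an illustration under assumptions of mine: the standard Newcomb amounts ($1,000,000 and $1,000), and the names `payoff`, `updating_agent`, and `predictor` are all hypothetical. The predictor here just checks what an agent that conditions on the prediction would do in each case, which stands in for the cheap reasoning described above.

```python
# Assumed standard Newcomb amounts; not stated in the original problem list.
BIG, SMALL = 1_000_000, 1_000
ACTIONS = ("one-box", "two-box")


def payoff(action, prediction):
    """Money received given the agent's action and the predictor's prediction."""
    big = BIG if prediction == "one-box" else 0
    return big + (SMALL if action == "two-box" else 0)


def updating_agent(prediction):
    """An agent that has simulated the predictor and updates on the result."""
    return max(ACTIONS, key=lambda a: payoff(a, prediction))


def predictor():
    """Predict by asking what the agent does for each possible prediction."""
    responses = {updating_agent(p) for p in ACTIONS}
    if len(responses) != 1:
        raise ValueError("agent's response depends on the prediction")
    # The agent acts the same way whatever is predicted, so predict that action.
    return responses.pop()


prediction = predictor()
action = updating_agent(prediction)
print(prediction, action, payoff(action, prediction))
```

In this toy model the updating agent ends up with the small prize, while an agent that one-boxed regardless of what it learned would be predicted to one-box and receive the large one.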
[3. The stupid winner paradox] Same as (2). We stipulate the weak player to be a $9 rock. Nothing to be surprised about.
No, again the weak player is applying reasoning to decide to demand $9, similar to the reasoning of the predictor above. To spell it out: My opponent is not logically updateless. Whatever I decide, it will simulate me and update on my decision, after which it will play the best response against that. Therefore I should demand $9.
(“Be logically updateless” is the seemingly obvious implication here, but how to do that without running into other issues is also an open problem.)
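The same reasoning in a toy bargaining game, as a rough sketch. I'm assuming a Nash-demand-style game over $10 with whole-dollar demands, where each player receives its demand if the two demands fit in the pot and both get nothing otherwise; these rules and the names `payoffs` and `best_response` are my assumptions for illustration only.

```python
POT = 10  # dollars to split (assumed game setup)


def payoffs(demand_a, demand_b):
    """Each player gets its demand if the demands are compatible, else both get 0."""
    if demand_a + demand_b <= POT:
        return demand_a, demand_b
    return 0, 0


def best_response(opponent_demand):
    """What a player who has simulated the opponent and updated on its demand does."""
    return max(range(POT + 1), key=lambda d: payoffs(d, opponent_demand)[0])


weak_demand = 9  # the weak player's demand, chosen by the reasoning above
strong_demand = best_response(weak_demand)
print(strong_demand, payoffs(strong_demand, weak_demand))
```

Once the strong player has updated on the $9 demand, every demand above $1 leaves it with nothing, so it settles for $1.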
I currently prefer the version of ASP where the agent has the option of simulating the predictor, but doesn’t have to do so or to make any particular use of the results if it does. Dependencies are contingent: the behavior of either party can result in some property of one of them (or of both of them jointly) not getting observed (through reasoning) by the other, including in a way predictable to the process of choosing said behavior.