It took me a week to think about it. Then I read all the comments, and thought about it some more. And now I think I have this “problem” well in hand. I also think that, incidentally, I arrived at Eliezer’s answer as well, though since he never spelled it out I can’t be sure.
To be clear—a lot of people have said that the decision depends on the problem parameters, so I’ll explain just what it is I’m solving. See, Eliezer wants our decision theory to WIN. That implies that we have all the relevant information—we can think of a lot of situations where we make the wisest decision possible based on available information and it turns out to be wrong; the universe is not fair, we know this already. So I will assume we have all the relevant information needed to win. We will also assume that Omega does have the capability to accurately predict my actions; and that causality is not violated (rationality cannot be expected to win if causality is violated!).
Assuming this, I can have a conversation with Omega before it leaves. Mind you, it’s not a real conversation, but having sufficient information about the problem means I can simulate its part of the conversation even if Omega itself refuses to participate and/or there isn’t enough time for such a conversation to take place. So it goes like this...
Me: “I do want to gain as much as possible in this problem. To that end, I would like you to put as much money in the box as possible. How do I do that?”
Omega: “I will put 1M$ in the box if you take only it; and nothing if you take both.”
Me: “Ah, but we’re not violating causality here, are we? That would be cheating!”
Omega: “True, causality is not violated. To rephrase, my decision on how much money to put in the box will depend on my prediction of what you will do. Since I have this capacity, we can consider these synonymous.”
Me: “Suppose I’m not convinced that they are truly synonymous. All right then. I intend to take only the one box”.
Omega: “Remember that I have the capability to predict your actions. As such I know if you are sincere or not.”
Me: “You got me. Alright, I’ll convince myself really hard to take only the one box.”
Omega: “Though you are sincere now, in the future you will reconsider this decision. As such, I will still place nothing in the box.”
Me: “And you are predicting all this from my current state, right? After all, this is one of the parameters in the problem—that after you’ve placed money in the boxes, you are gone and can’t come back to change it”.
Omega: “That is correct; I am predicting a future state from information on your current state”.
Me: “Aha! That means I do have a choice here, even before you have left. If I change my state so that I am unable or unwilling to two-box once you’ve left, then your prediction of my future “decision” will be different. In effect, I will be hardwired to one-box. And since I still want to retain my rationality, I will make sure that this hardwiring is strictly temporary.”
*fiddling with my own brain a bit*
Omega: “I have now determined that you are unwilling to take both boxes. As such, I will put the 1,000,000$ in the box.”
*Omega departs*
*I walk unthinkingly toward the boxes and take just the one*
Voila. Victory is achieved.
My main conclusion here is that any decision theory that does not allow for changing strategies is a poor decision theory indeed. This IS essentially the Friendly AI problem: You can rationally one-box, but you need to have access to your own source code in order to do so. Not having that would be so inflexible as to be the equivalent of an Iterated Prisoner’s Dilemma program that can only defect or only cooperate; that is, a very bad one.
The reason this is not obvious is that the way the problem is phrased is misleading. Omega supposedly leaves “before you make your choice”, but in fact there is not a single choice here (one-box or two-box). Rather, there are two decisions to be made, if you can modify your own thinking process:
1. Whether or not to have the ability and inclination to make decision #2 “rationally” once Omega has left, and
2. Whether to one-box or two-box.
...Where decision #1 can and should be made prior to Omega’s leaving, and obviously DOES influence what’s in the box.
Decision #2 does not influence what’s in the box, but the state in which I approach that decision does. This is very confusing initially.
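To make the two-decision structure concrete, here's a minimal sketch (just my own toy model, not anyone's formal decision theory; the class name and payoff constants are illustrative assumptions). The point it demonstrates: Omega only reads the disposition that exists before it leaves, so decision #1 determines the contents of box B, and decision #2 can't move any money.

```python
# Toy model of the two-decision view above (illustrative assumptions only).
PAYOFF_A = 1_000        # the always-visible box A
PAYOFF_B = 1_000_000    # what Omega puts in box B if it predicts one-boxing

class Agent:
    def __init__(self, hardwired_to_one_box):
        # Decision #1: fix the disposition *before* Omega makes its prediction.
        self.hardwired_to_one_box = hardwired_to_one_box

    def choose(self):
        # Decision #2: made after Omega has left; it can only cash out the
        # disposition that was already in place at prediction time.
        return "one-box" if self.hardwired_to_one_box else "two-box"

def play(agent):
    # Omega predicts from the agent's current state, fills the boxes, and leaves.
    b_is_full = agent.hardwired_to_one_box
    # Only now does decision #2 happen; the boxes are already fixed.
    if agent.choose() == "one-box":
        return PAYOFF_B if b_is_full else 0
    return PAYOFF_A + (PAYOFF_B if b_is_full else 0)

print(play(Agent(hardwired_to_one_box=True)))   # 1000000
print(play(Agent(hardwired_to_one_box=False)))  # 1000
```

The only knob that changes the outcome is the disposition fixed before the prediction; whatever “choice” happens afterwards just acts it out.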
Now, I don’t really know CDT too well, but it seems to me that presented as these two decisions, even it would be able to correctly one-box on Newcomb’s problem.
Am I wrong?
Eliezer—if you are still reading these comments so long after the article was published—I don’t think it’s an inconsistency in the AI’s decision making if the AI’s decision making is influenced by its internal state. In fact I expect that to be the case. What am I missing here?
Let me try my own stab at a little chat with Omega. By the end of the chat I will either have 1001 K, or give up. Right now, I don’t know which.
Act I
Everything happens pretty much as it did in Polymeron’s dialogue, up until…
Me: “Aha! That means I do have a choice here, even before you have left. If I change my state so that I am unable or unwilling to two-box once you’ve left, then your prediction of my future “decision” will be different. In effect, I will be hardwired to one-box. And since I still want to retain my rationality, I will make sure that this hardwiring is strictly temporary.”
Omega: Yup, that’ll work. So you’re happy with your 1000 K?
Act II
Whereupon I try to exploit randomness.
Me: Actually, no. I’m not happy. I want the entire 1001 K. Any suggestions for outsmarting you?
Omega: Nope.
Me: Are you omniscient?
Omega: As far as you’re concerned, yes. Your human physicists might disagree in general, but I’ve got you pretty much measured.
Me: Okay, then. Wanna make a bet? I bet I can find a way to get over 1000 K if I make a bet with you. You estimate your probability of being right at 100%, right? Nshepperd had a good suggestion….
Omega: I won’t play this game. Or let you play it with anyone else. I thought we’d moved past that.
Me: How about I flip a fair coin to decide between B and A+B? In fact, I’ll use a generator based on the quantum uncertainty principle to produce the outcome of a truly random coin flip. Even you can’t predict the outcome.
Omega: And what do you expect to happen as a result of this (not-as-clever-as-you-think) strategy?
Me: Since you can’t predict what I’ll do, hopefully you’ll fill both boxes. Then there’s a true 50% chance of me getting 1001 K. My expected payoff is 1000.5 K.
Omega: That, of course, is assuming I’ll fill both boxes.
Me: Oh, I’ll make you fill both boxes. I’ll bias the generator to a 50+eps% chance of one-boxing, for expected winnings of 1000.5 K – eps. Then if you want to maximize your omniscience-y-ness, you’ll have to fill both boxes.
Omega: Oh, taking others’ suggestions already? Can’t think for yourself? Making edits to make it look like you’d thought of it in time? Fair enough. Attribute this one to gurgeh. As to the idea itself, I’ll disincentivize you from randomization at all. I won’t fill box B if I predict you cheating.
Me: But then there’s a 50-eps% chance of proving you wrong. I’ll take it. MWAHAHA.
Omega: What an idiot. You’re not trying to prove me wrong. You’re trying to maximize your own profit.
Me: The only reason I don’t insult you back is because I operate under Crackers Rule.
Omega: Crocker’s Rules.
Me: Uh. Right. Whoops.
Omega: Besides. Your random-generator idea won’t work even to get you the cheaters’ utility for proving me wrong.
Me: Why not? I thought we’d established that you can’t predict a truly random outcome.
Omega: I don’t need to. I can just mess with your randomness generator so that it gives out pseudo-random numbers instead.
Me: You’re omnipotent now, too?
Omega: Nope. I’ll just give someone a million dollars to do something silly.
Me: No one would ever…! Oh, wait. Anyway, I’ll be able to detect tampering with randomness, the same way it’s possible with a Mersenne twister….
Omega: And I know exactly how soon you’ll give up. Oh, and don’t waste page space suggesting secondary and tertiary levels of ensuring randomness. If, to guide your behavior, you’re using the table of random numbers that I already have, then I already know what you’d do.
Me: Is there any way at all of outsmarting you and getting 1001 K?
Omega: Not one you can find.
Me: Okay then… let me consult smarter people.
This conversation is obviously not going my way. Any suggestions for Act III?
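(For the record, here's a quick check of the Act II arithmetic, under the assumption — which Omega refuses to grant — that it fills both boxes whenever it can't predict the choice:)

```python
# Expected payoff (in K$) of the biased-coin strategy from Act II.
# eps is the bias toward one-boxing; b_filled says whether Omega filled box B.
def expected_payoff(eps, b_filled=True):
    p_one_box = 0.5 + eps
    one_box = 1000 if b_filled else 0   # box B only
    two_box = one_box + 1               # box A always holds 1 K
    return p_one_box * one_box + (1 - p_one_box) * two_box

print(expected_payoff(0.0))                   # 1000.5   (fair coin)
print(expected_payoff(0.01))                  # ~1000.49 = 1000.5 - eps, as claimed
print(expected_payoff(0.01, b_filled=False))  # ~0.49    (Omega leaves B empty)
```

Which is the whole problem: once Omega declines to fill box B whenever it predicts randomization, the strategy's expected payoff collapses to under 1 K.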