Cheating Omega
In Newcomb’s problem, a superintelligence called Omega shows you two boxes, A and B, and offers you the choice of taking only box A, or both boxes A and B. Omega has put $1,000 in box B. If Omega thinks you will take box A only, he has put $1,000,000 in box A. Otherwise he has left box A empty. Omega has played this game many times, and has never been wrong in his predictions about whether someone will take both boxes or not.
Though it remains a controversial position, my audience has probably heard by now that the “rational” answer is to one-box. (If not, see Newcomb’s problem on the wiki.)
I can do better.
The Deal
See, I’ve heard about Omega. I’m prepared. I installed the Universe Splitter app on my iPhone. It’s basically a quantum coin flip: both outcomes happen in their respective Everett branches.
Now, I’ve pre-committed that after Omega offers me The Deal, I’ll make two quantum coin flips. If I get two tails in a row, I’ll two-box. Otherwise, I’ll one-box.
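For concreteness, here’s a minimal sketch of that pre-commitment in Python. The function names are mine, and secrets.randbits is only a stand-in for the Universe Splitter: its output is hard to predict, but it isn’t actually quantum.

```python
import secrets

def quantum_coin() -> str:
    """Stand-in for one Universe Splitter flip; returns 'H' or 'T'.
    (Assumption: the real thing would call quantum hardware.)"""
    return "T" if secrets.randbits(1) else "H"

def choose_boxes() -> str:
    """The pre-committed rule: two-box only on two tails in a row."""
    flips = quantum_coin() + quantum_coin()
    return "two-box" if flips == "TT" else "one-box"
```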
It’s an interesting finding of game theory that sometimes a winning strategy is to deliberately limit yourself. If you’re playing a game of chicken for the fate of the human race, the winning strategy is to defiantly rip off the steering wheel and throw it out the window before your opponent does.
I’ve gained a third option in Newcomb’s game by deliberately limiting my knowledge of my future actions. I physically can’t know at the present time if I’ll choose one box or two. (As both outcomes do, in fact, happen assuming Many Worlds.) And crucially, Omega physically can’t know that either. Not until after it decides the contents of box A.
Therefore, Omega must predict that I’ll one-box. After all, it’s the more probable outcome: there’s a 75% chance. If you have a large urn filled with three purple skittles for every yellow skittle, then your most accurate prediction of a sequence of four draws is PPPP, rather than some permutation of YPPP, as one might naively expect.
There are now four possible outcomes, depending on my two coin flips (Omega has predicted one-box, so box A holds $1,000,000):
HH = $1,000,000
HT = $1,000,000
TH = $1,000,000
TT = $1,001,000
The expected value is $4,001,000 / 4 = $1,000,250. I just cheated Omega out of an expected $250 over one-boxing.
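A few throwaway lines of Python reproduce that number, under the assumption that Omega has indeed predicted one-box and filled box A:

```python
from itertools import product

BOX_A = 1_000_000  # Omega predicted one-box, so box A is full
BOX_B = 1_000

payoffs = []
for flips in product("HT", repeat=2):       # HH, HT, TH, TT
    two_box = (flips == ("T", "T"))         # two-box only on two tails
    payoffs.append(BOX_A + BOX_B if two_box else BOX_A)

print(sum(payoffs) / len(payoffs))          # 1000250.0
```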
Having proven a strategy superior to one-boxing, I can claim that if your decision theory just one-boxes without pre-committing to use quantum dice, something is wrong with it.
Even better
Of course, we can do even better than this. Two coin flips are easy to think about, but you could generate enough quantum noise to two-box with any probability you choose. The optimal two-box probability approaches 50% from below. You could aim for an expected payoff of $1,000,499.99...
In other words, in the limit, a quantum dice strategy is no more than $500 better than one-boxing.
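To see where that ceiling comes from, call the two-box probability p. As long as p stays below 1/2, Omega still predicts one-box, so the expected payoff is $1,000,000 + $1,000 × p. The helper below is my own illustration of that, not anything from the original setup:

```python
def expected_payoff_vs_perfect_omega(p_two_box: float) -> float:
    """Expected payoff when Omega correctly predicts one-box
    (requires p_two_box < 0.5), so box A always holds $1,000,000."""
    assert p_two_box < 0.5
    return (1 - p_two_box) * 1_000_000 + p_two_box * 1_001_000

print(expected_payoff_vs_perfect_omega(0.25))    # 1000250.0 (two flips)
print(expected_payoff_vs_perfect_omega(0.4999))  # ~1000499.9 (near the $1,000,500 limit)
```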
But the perfect predictor Omega is only the limiting case. Newcomblike reasoning can apply even if the predictor is imperfect. Does this change the optimal strategy?
Let’s try extreme cases. Suppose Omega predicts, then rolls a quantum d100 and flips its prediction on a roll of 42. This gives it a record of 99% accuracy. The two-flip strategy looks much like before.
99% chance of predicted one-box: the four flip outcomes pay $1,000,000, $1,000,000, $1,000,000, and $1,001,000, for a conditional expectation of $1,000,250.
1% chance of predicted two-box: the four flip outcomes pay $0, $0, $0, and $1,000, for a conditional expectation of $250.
Payoffs are 99% × $1,000,250 + 1% × $250, for an expected payoff of $990,250. You expect only $990,000 by one-boxing, and only $11,000 by two-boxing. Two flips still works.
In the other extreme, suppose Omega flips 49.9% of its answers. This gives it a record of 50.1% accuracy.
50.1% chance of predicted one-box: conditional expectation $1,000,250, as before.
49.9% chance of predicted two-box: conditional expectation $250.
Payoffs are 50.1% × $1,000,250 + 49.9% × $250, for an expected payoff of $501,250. You expect only $501,000 by one-boxing. It’s the same difference of $250. But what if you two-box?
50.1% chance of predicted two-box × $1,000 = $501
49.9% chance of predicted one-box × $1,001,000 = $499,499
Expected payoff of two-boxing: $500,000
It’s interesting that two flips is still $250 better than one-boxing.
Now for the other limit. Suppose Omega just flips a coin to predict. The game is no longer Newcomblike. Expected payoff for one-boxing is $500,000; for two-boxing it is $501,000, which is $1,000 better (box A’s contents no longer depend on your choice, and box B is free money). The two-flip strategy is only $500,250, or $250 better than one-boxing. When Omega’s prediction is no better than chance, you might as well two-box.
But recall that the quantum dice strategy, pushed toward its 50% limit, makes up to an expected $500 more than one-boxing, and that gain holds regardless of how well Omega can predict your choice. Given quantum dice, Newcomb’s problem is not Newcomblike.
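Here is a sketch of that claim in general form. The parameterization is my own: accuracy is the chance that Omega’s base prediction of one-box survives its noise, and p_two_box is the probability my dice tell me to grab both boxes. Whatever the accuracy, the dice strategy beats plain one-boxing by about $1,000 × p_two_box.

```python
def expected_payoff(accuracy: float, p_two_box: float) -> float:
    """Dice strategy with p_two_box < 0.5, so Omega's base prediction is
    one-box; 'accuracy' is the chance that prediction survives Omega's noise."""
    full = accuracy * (1_000_000 + p_two_box * 1_000)   # box A full
    empty = (1 - accuracy) * (p_two_box * 1_000)        # box A empty
    return full + empty

for a in (1.0, 0.99, 0.501, 0.5):
    dice = expected_payoff(a, 0.25)       # two quantum flips
    one_box = expected_payoff(a, 0.0)     # plain one-boxing
    print(a, dice, dice - one_box)        # the gap is ~$250 at every accuracy
```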
While Many Worlds makes for a useful intuition pump, the above argument doesn’t appear to require that interpretation to work. (Though Many Worlds is probably correct.) The dice may not even have to be quantum. They just have to be unpredictably random.
The qualification “given quantum dice” is not vacuous. A simple computer algorithm isn’t good enough against a sufficiently advanced predictor. Pseudorandom sequences can be reproduced and predicted. The argument requires actual hardware. Hardware that some humans possess, but not innately. If you want to call that cheating, see the title, but also recall that we’re interested in Newcomblike problems not just for human rationality, but for a better theory of idealized agents, which will help us reason about AI problems. AIs that could perhaps read each other’s source code as part of their negotiations. How do you negotiate with an agent capable of wielding quantum randomness?
Some questions
Can other Newcomblike games be subverted this way?
Can the predictor counter this? If given quantum dice?
If the games are iterated, what’s the behavior in the limit? Taking Newcomb’s problem as an example, assume Omega’s goal is to cause one-boxing. A subgoal would be to maintain high accuracy, so the choosers believe in Omega’s accuracy. Recall that if Omega can’t predict better than chance, the optimal strategy is to two-box. Assume the chooser’s goal is to maximize return.
A chooser might want to use dice to be as close to 50% as possible, while still being predictably more likely to one-box. We know the rational choice given skittles in an urn, but should Omega deliberately punish such defectors by biasing in favor of two-box predictions? Even though this reduces Omega’s accuracy? With one chooser, this starts to look like an iterated prisoner’s dilemma. With multiple choosers, like a tragedy of the commons. Even if Omega tries its best, its track record will suffer once choosers start using dice. How is the next chooser supposed to tell whether this is because Omega is defecting (or defective) or because other choosers are just that unpredictable?
So last week, the heavens opened, Omega descended from outer space, and offered me The Deal.
With a smirk, I whipped out my iPhone and opened the Universe Splitter.
“Oh, one more thing, Chooser,” said Omega.
I froze, then slowly looked up from my iPhone. Omega couldn’t possibly be changing the rules on me. That wasn’t The Deal. And it’s hard to make optimal decisions under time pressure, you know?
“What’s that?” I asked cautiously.
“I noticed you trying to split the universe just now.” Nothing gets past you, huh? Omega continued, “I thought you should know, I’m actually not omniscient. I’m not even a super predictor or anything.”
“But...you’ve never been wrong,” I countered, incredulous.
Omega pressed on, “I only ‘predicted’ your choice by quantum coin flip. However, I’ve always pre-committed to immediately destroy the universe if I turn out to have predicted wrong. I’ve done it this way for every Deal. You, of course, can only find yourself in an Everett branch where I have not (yet) destroyed the universe. My apparent perfect track record is purely due to anthropic effects. Just saying.”
I felt my pulse quicken as my eyes slowly widened in horror.
“YOU CAN’T DO THAT!” I raged.
“Chooser, I am a thought experiment! I can do anything you can imagine,” Omega declared triumphantly. “If you think about it carefully, you will realize that I have never even changed The Deal. Now choose.”
I began frantically typing cases into my iPhone. If I stick to my original plan, there are eight possible outcomes, based on Omega’s coin flip “prediction” and my two coin flips:
1 HH = $1,000,000
1 HT = $1,000,000
1 TH = $1,000,000
1 TT = $1,001,000, but universe is destroyed
2 HH = $0, but universe is destroyed
2 HT = $0, but universe is destroyed
2 TH = $0, but universe is destroyed
2 TT = $1,000
“I don’t want to get destroyed, but since my counterpart in the other branch created by Omega’s ‘prediction’ event will be making his decision by the same process, it looks like I’ll be losing half of my measure either way. Maybe I should just ignore the branches where I don’t exist,” I thought aloud as I deleted the lines.
My expected payoff is now $3,001,000 / 4 = $750,250. Looking at it this way, it’s clear that I should just one-box.
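(For the record, here’s roughly what I was punching into the phone, rewritten as a quick Python sketch; the enumeration and the names in it are mine, not Omega’s.)

```python
from itertools import product

BOX_B = 1_000
surviving = []
for prediction, flips in product(("one-box", "two-box"), product("HT", repeat=2)):
    my_choice = "two-box" if flips == ("T", "T") else "one-box"
    if my_choice != prediction:
        continue                           # Omega destroys this branch
    box_a = 1_000_000 if prediction == "one-box" else 0
    surviving.append(box_a + (BOX_B if my_choice == "two-box" else 0))

print(sum(surviving) / len(surviving))     # 750250.0
```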
I reached out to take only box A, but hesitated.
As tears welled up, I wondered: was I about to die? I was briefly tempted to do one quantum coin flip to decide anyway, just so that I could be sure a future from this point existed. But then I realized that regardless of the outcome, I’d just be tempted to do so again and again, forever. That way lies madness.
If all memory of what happened since you woke up this morning is erased or replaced, but then life continues as normal, have you died? Don’t you do something similar every night? If I had known Omega’s evil plan from the start, would my decision have changed?
No. It wouldn’t have. I swallowed my tears. I held my breath. I picked up box A. Omega retrieved box B and flew off into the sunset.
Omega, of course, “predicted” this perfectly, and box A contained my expected million.
It didn’t occur to me until the next day that Omega might have been bluffing. Why did I just take Omega’s word for it?