I’m not reading 127 comments, but as a newcomer who’s been invited to read this page, along with barely a dozen others, as an introduction, I don’t want to leave this unanswered, even though what I have to say has probably already been said.
First of all, the answer to Newcomb’s Problem depends a lot on precisely what the problem is. I have seen versions that posit time travel, and therefore backwards causality. In that case, it’s quite reasonable to take only one box, because your decision to do so does have a causal effect on the amount in Box B. Presumably causal decision theorists would agree.
However, in any version of the problem where there is no clear evidence of violations of currently known physics and where the money has been placed by Omega before my decisions, I am a two-boxer. Yet I think that your post above must not be talking about the same problem that I am thinking of, especially at the end. Although you never said so, it seems to me that you must be talking about a problem which says “If you choose Box B, then it will have a million dollars; if you choose both boxes, then Box B will be empty.” But that is simply not what the facts will be if Omega has made the decision in the past and currently understood physics applies. In the problem as stated, Omega may make mistakes in the future, and that makes all the difference.
It’s presumptuous of me to assume that you’re talking about a different problem from the one that you stated, I know. But as I read the psychological states that you suggest that I might have (that I might wish that I considered one-boxing rational, for example), they seem utterly insane. Why would I wish such a thing? What does it have to do with anything? The only thing that I can wish for is that Omega has predicted that I will be a one-boxer, which has nothing to do with what I consider rational now.
The quotation from Joyce explains it well, up until the end, where poor phrasing may have confused you. The last sentence should read:
When Rachel wishes she was Irene’s type she is wishing for Irene’s circumstances, not wishing to make Irene’s choice.
It is simply not true that Rachel envies Irene’s choice. Rachel envies Irene’s situation, the situation where there is a million dollars in Box B. And if Rachel were in that situation, then she would still take both boxes! (At least if I understand Joyce correctly.)
Possibly one thing that distinguishes me from one-boxers, and maybe even most two-boxers, is that I understand fundamental physics rather thoroughly and my prior has a very strong presumption against backwards causality. The mere fact that Omega has made successful predictions about Newcomb’s Paradox will never be enough to overrule that. Even being superintelligent and coming from another galaxy is not enough, although things change if Omega (known to be superintelligent and honest) claims to be a time-traveller. Perhaps for some one-boxers, and even for some irrational two-boxers, Omega’s past success at prediction is good evidence for backwards causality, but not for me.
So suppose that somebody puts two boxes down before me, presents convincing evidence for the situation as you stated it above (but no more), and goes away. Then I will simply take all of the money that this person has given me: both boxes. Before I open them, I will hope that they predicted that I will choose only one. After I open them, if I find Box B empty, then I will wish that they had predicted that I would choose only one. But I will not wish that I had chosen only one. And I certainly will not hope, beforehand, that I will choose only one and yet nevertheless choose two; that would indeed be irrational!
You are disposed to take two boxes. Omega can tell. (Perhaps by reading your comment. Heck, I can tell by reading your comment, and I’m not even a superintelligence.) Omega will therefore not put a million dollars in Box B if it sets you a Newcomb’s problem, because its decision to do so depends on whether you are disposed to take both boxes or not, and you are.
I am disposed to take one box. Omega can tell. (Perhaps by reading this comment. I bet you can tell by reading my comment, and I also bet that you’re not a superintelligence.) Omega will therefore put a million dollars in Box B if it sets me a Newcomb’s problem, because its decision to do so depends on whether I am disposed to take both boxes or not, and I’m not.
If we both get pairs of boxes to choose from, I will get a million dollars. You will get a thousand dollars. I will be monetarily better off than you.
But wait! You can fix this. All you have to do is be disposed to take just Box B. You can do this right now; there’s no reason to wait until Omega turns up. Omega does not care why you are so disposed, only that you are so disposed. You can mutter to yourself all you like about how silly the problem is; as long as you wander off with just B under your arm, it will tend to be the case that you end the day a millionaire.
Some time ago I figured out a refutation of this kind of reasoning in Counterfactual Mugging, and it seems to apply in Newcomb’s Problem too. It goes as follows:
Imagine another god, Upsilon, that offers you a similar two-box setup—except to get the $2M in box B, you must be a one-boxer with regard to Upsilon and a two-boxer with regard to Omega. (Upsilon predicts your counterfactual behavior if you’d met Omega instead.) Now you must choose your dispositions wisely because you can’t win money from both gods. The right disposition depends on your priors for encountering Omega or Upsilon, which is a “bead jar guess” because both gods are very improbable. In other words, to win in such problems, you can’t just look at each problem individually as it arises—you need to have the correct prior/predisposition over all possible predictors of your actions, before you actually meet any of them. Obtaining such a prior is difficult, so I don’t really know what I’m predisposed to do in Newcomb’s Problem if I’m faced with it someday.
Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I’m likely to encounter. Upsilon treats me on the basis of a guess I would subjunctively make without knowledge of Upsilon. It is therefore not surprising that I tend to do much better with Omega than with Upsilon, because the relevant choices being made by me are being made with much better knowledge. To put it another way, when Omega offers me a Newcomb’s Problem, I will condition my choice on the known existence of Omega, and all the Upsilon-like gods will tend to cancel out into Pascal’s Wagers. If I run into an Upsilon-like god, then, I am not overly worried about my poor performance—it’s like running into the Christian God, you’re screwed, but so what, you won’t actually run into one. Even the best rational agents cannot perform well on this sort of subjunctive hypothesis without much better knowledge while making the relevant choices than you are offering them. For every rational agent who performs well with respect to Upsilon there is one who performs poorly with respect to anti-Upsilon.
On the other hand, beating Newcomb’s Problem is easy, once you let go of the idea that to be “rational” means performing a strange ritual cognition in which you must only choose on the basis of physical consequences and not on the basis of correct predictions that other agents reliably make about you, so that (if you choose using this bizarre ritual) you go around regretting how terribly “rational” you are because of the correct predictions that others make about you. I simply choose on the basis of the correct predictions that others make about me, and so I do not regret being rational.
And these questions are highly relevant and realistic, unlike Upsilon; in the future we can expect there to be lots of rational agents that make good predictions about each other.
Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I’m likely to encounter.
In what sense can you update? Updating is about following a plan, not about deciding on a plan. You already know that it’s possible to observe anything; you don’t learn anything new about the environment by observing any given thing. There could be a deep connection between updating and logical uncertainty that makes it a good plan to update, but it’s not obvious what it is.
Huh? Updating is just about updating your map, no? I didn’t understand the reasoning of the next sentence; could you expand?
Intuitively, the notion of updating a map of fixed reality makes sense, but in the context of decision-making, formalization in full generality has so far proved elusive, even unnecessary.
By making a choice, you control the truth value of certain statements—statements about your decision-making algorithm and about mathematical objects depending on your algorithm. Only some of these mathematical objects are part of the “real world”. Observations affect what choices you make (“updating is about following a plan”), but you must have decided beforehand what consequences you want to establish (“[updating is] not about deciding on a plan”). You could have decided beforehand to care only about mathematical structures that are “real”, but what characterizes those structures apart from the fact that you care about them?
This is not a refutation, because what you describe is not about the thought experiment. In the thought experiment, there are no Upsilons, and so nothing to worry about. It is if you face this scenario in real life, where you can’t be given guarantees about the absence of Upsilons, that your reasoning becomes valid. But it doesn’t refute the reasoning about the thought experiment where it’s postulated that there are no Upsilons.
Thanks for dropping the links here. FWIW, I agree with your objection. But at the very least, the people claiming they’re “one-boxers” should also make the distinction you make.
Also, user Nisan tried to argue that various Upsilons and other fauna must balance themselves out if we use the universal prior. We eventually took this argument to email, but failed to move each other’s positions.
OK. I assume the usual (Omega and Upsilon are both reliable and sincere, I can reliably distinguish one from the other, etc.)
Then I can’t see how the game doesn’t reduce to standard Newcomb, modulo a simple probability calculation, mostly based on “when I encounter one of them, what’s my probability of meeting the other during my lifetime?” (plus various “actuarial” calculations).
If I have no information about the probability of encountering either, then my decision may be incorrect—but there’s nothing paradoxical or surprising about this, it’s just a normal, “boring” example of an incomplete information problem.
you need to have the correct prior/predisposition over all possible predictors of your actions, before you actually meet any of them.
I can’t see why that is—again, assuming that the full problem is explained to you on encountering either Upsilon or Omega, both are truthful, etc. Why can I not perform the appropriate calculations and make an expectation-maximising decision even after Upsilon-Omega has left? Surely Omega-Upsilon can predict that I’m going to do just that and act accordingly, right?
Yes, this is a standard incomplete information problem. Yes, you can do the calculations at any convenient time, not necessarily before meeting Omega. (These calculations can’t use the information that Omega exists, though.) No, it isn’t quite as simple as you state: when you meet Omega, you have to calculate the counterfactual probability of you having met Upsilon instead, and so on.
Something seems off about this, but I’m not sure what.
I’m pretty sure the logic is correct. I do make silly math mistakes sometimes, but I’ve tested this one on Vladimir Nesov and he agrees. No comment from Eliezer yet (this scenario was first posted to decision-theory-workshop).
It reminds me vaguely of Pascal’s Wager, but my cached responses thereunto are not translating informatively.
Then I think the original Newcomb’s Problem should remind you of Pascal’s Wager just as much, and my scenario should be analogous to the refutation thereof. (Thereunto? :-)
But wait! You can fix this. All you have to do is be disposed to take just Box B.
No, that’s not what I should do. What I should do is make Omega think that I am disposed to take just Box B. If I can successfully make Omega think that I’ll take only Box B but still take both boxes, then I should. But since Omega is superintelligent, let’s take it as understood that the only way to make Omega think that I’ll take only Box B is to make it so that I’ll actually take Box B. Then that is what I should do.
But I have to do it now! (I don’t do it now only because I don’t believe that this situation will ever happen.) Once Omega has placed the boxes and left, if the known laws of physics apply, then it’s too late!
If you take only Box B and get a million dollars, wouldn’t you regret having not also taken Box A? Not only would you have gotten a thousand dollars more, you’d also have shown up that know-it-all superintelligent intergalactic traveller too! That’s a chance that I’ll never have, since Omega will read my comment here and leave my Box B empty, but you might have that chance, and if so then I hope you’ll take it.
It’s not really too late then. Omega can predict what you’ll do between seeing the boxes, and choosing which to take. If this is going to include a decision to take one box, then Omega will put a million dollars in that box.
I will not regret taking only one box. It strikes me as inconsistent to regret acting as the person I most wish to be, and it seems clear that the person I most wish to be will take only one box; there is no room for approved regret.
If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin’s comment below). I agree that if causality doesn’t work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.
If known physics applies, then Omega can predict all it likes, but my actions after it has placed the boxes cannot affect that prediction. There is always the chance that it predicts that I will take both boxes but I take only Box B. There is even the chance that it will predict that I will take only Box B but I take both boxes. Nothing in the problem statement rules that out. It would be different if that were actually impossible for some reason.
I will not regret taking only one box.
I knew that you wouldn’t, of course, since you’re a one-boxer. And we two-boxers will not regret taking both boxes, even if we find Box B empty. Better $1000 than nothing, we will think!
If known physics applies, then Omega can predict all it likes, but my actions after it has placed the boxes cannot affect that prediction. There is always the chance that it predicts that I will take both boxes but I take only Box B. There is even the chance that it will predict that I will take only Box B but I take both boxes. Nothing in the problem statement rules that out. It would be different if that were actually impossible for some reason.
Ah, I see what the problem is. You have a confused notion of free will and what it means to make a choice.
Making a choice between two options doesn’t mean there is a real chance that you might take either option (there is always at least an infinitesimal chance, but that is true even of things that are not usefully described as a choice). It just means that the reason for your taking whatever option you take is most usefully attributed to you (and not, e.g., to gravity, the government, or the person holding a gun to your head). In the end, though, it is (unless the choice is so close that random noise makes the difference) a fact about you that you will make the choice you will make. And it is in principle possible for others to discover this fact about you.
If it is a fact about you that you will one-box, it is not possible that you will two-box. If it is a fact about you that you will two-box, it is not possible that you will one-box. If it is a fact about you that you will leave the choice up to chance, then Omega probably doesn’t offer you the deal in the first place.
Now, when deciding what choice to make, it is usually most useful to pretend that there is a real possibility of taking either option, since that generally causes facts about you that are more beneficial to you. And that you do that is just another fact about you, and it influences the fact about which choice you make. Usually the fact of which choice you will make has no consequences before you make your choice, so when counterfactually considering the consequences of either choice you can model the rest of the world as being the same in either case up to that point. But the fact of which choice you will make is just another fact like any other, and it is allowed, even if it usually doesn’t, to have consequences before that point in time. If it does, then it is best, for the very same reason you pretend that either choice is a real possibility in the first place, to also model the rest of the world as different contingent on your choice. That doesn’t mean backwards causality. Modeling the world in this way is just another fact about you that generates good outcomes.
Alicorn:
It’s not really too late then. Omega can predict what you’ll do between seeing the boxes, and choosing which to take. If this is going to include a decision to take one box, then Omega will put a million dollars in that box.
TobyBartels:
If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin’s comment below). I agree that if causality doesn’t work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.
I remember reading an article about someone who sincerely lacked respect for people who were ‘soft’ (not exact quote) on the death penalty … before ending up on the jury of a death penalty case, and ultimately supporting life in prison instead. It is not inconceivable that a sufficiently canny analyst (e.g. Omega) could deduce that the process of being picked would motivate you to reconsider your stance. (Or, perhaps more likely, motivate a professed one-boxer like me to reconsider mine.)
If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin’s comment below). I agree that if causality doesn’t work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.
Beware hidden inferences. Taboo causality.
I don’t see what that link has to do with anything in my comment thread. (I haven’t read most of the other threads in reply to this post.)
I should explain what I mean by ‘causality’. I do not mean some metaphysical necessity, whereby every event (called an ‘effect’) is determined (or at least influenced in some asymmetric way) by other events (called its ‘causes’), which must be (or at least so far seem to be) prior to the effect in time, leading to infinite regress (apparently back to the Big Bang, which is somehow an exception). I do not mean anything that Aristotle knew enough physics to understand in any but the vaguest way.
I mean the flow of macroscopic entropy in a physical system.
The best reference that I know on the arrow of time is Huw Price’s 1996 book Time’s Arrow and Archimedes’ Point. But actually I didn’t understand how entropy flow leads to a physical concept of causality until several years after I read that, so that might not actually help, and I’m having no luck finding the Internet conversation that made it click for me.
But basically, I’m saying that, if known physics applies, and if M stands for all of the information available on a macroscopic level at the moment Omega placed the boxes, then P(there is money in Box B | M) = P(there is money in Box B | M & I pick both boxes), even though P(I pick both boxes | M) < 1, because macroscopic entropy strictly increases between the placing of the boxes and the time that I finally pick a box.
So I need to be given evidence that known physics does not apply before I pick only Box B, and a successful record of predictions by Omega will not do that for me.
The Psychopath Button: Paul is debating whether to press the ‘kill all psychopaths’ button. It would, he thinks, be much better to live in a world with no psychopaths. Unfortunately, Paul is quite confident that only a psychopath would press such a button. Paul very strongly prefers living in a world with psychopaths to dying. Should Paul press the button? (Set aside your theoretical commitments and put yourself in Paul’s situation. Would you press the button? Would you take yourself to be irrational for not doing so?)
Newcomb’s Firebomb:
There are two boxes before you. Box A definitely contains $1,000,000. Box B definitely contains $1,000. You have two choices: take only box A (call this one-boxing), or take both boxes (call this two-boxing). You will signal your choice by pressing one of two buttons. There is, as usual, an uncannily reliable predictor on the scene. If the predictor has predicted that you will two-box, he has planted an incendiary bomb in box A, wired to the two-box button, so that pressing the two-box button will cause the bomb to detonate, burning up the $1,000,000. If the predictor has predicted that you will one-box, no bomb has been planted – nothing untoward will happen, whichever button you press. The predictor, again, is uncannily accurate.
From Andy Egan.
I would suggest looking at your implicit choice of counterfactuals and their role in your decision theory. Standard causal decision theory involves local violations of the laws of physics (you assign probabilities to the world being such that you’ll one-box, or such that you’ll two-box, and then ask what miracle magically altering your decision, without any connection to your psychological dispositions, etc., would deliver the highest utility). Standard causal decision theory is a normative principle for action that says to do the action that would deliver the most utility if a certain kind of miracle happened. But you can get different versions of causal decision theory by substituting different sorts of miracles, e.g. you can say: “if I one-box, then I have a psychology that one-boxes, and likewise for two-boxing”, so you select the action such that a miracle giving you the disposition to do so earlier on would have been better. Yet another sort of counterfactual that can be hooked up to the causal decision theory framework would go “there’s some mathematical fact about what decision (decisions, given Everett) my brain structure leads to in standard physics, and the predictor has access to this mathematical info, so I’ll select the action that would be best brought about by a miracle changing that mathematical fact”.
Thanks for the replies, everybody!
This is a global response to several replies within my little thread here, so I’ve put it at nearly the top level. Hopefully that works out OK.
I’m glad that FAWS brought up the probabilistic version. That’s because the greater the probability that Omega makes mistakes, the more inclined I am to take two boxes. I once read the claim that 70% of people, when told Newcomb’s Paradox in an experiment, claim to choose to take only one box. If this is accurate, then Omega can achieve a 70% level of accuracy by predicting that everybody is a one-boxer. Even if 70% is not accurate, you can still make the paradox work by adjusting the dollar amounts, as long as the bias is great enough that Omega can be confident that it will show up at all in the records of its past predictions. (To be fair, the proportion of two-boxers will probably rise as Omega’s accuracy falls, and changing the stakes should also affect people’s choices; there may not be a fixed point, although I expect that there is.)
If, in addition to the problem as stated (but with only 70% probability of success), I know that Omega always predicts one-boxing, then (hopefully) everybody agrees that I should take both boxes. There needs to be some correlation between Omega’s predictions and the actual outcomes, not just a high proportion of past successes.
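To make the point concrete, here is a quick sketch (my own illustration, not part of the problem statement) of a ‘predictor’ that always predicts one-boxing against a population that one-boxes 70% of the time:

    # Sketch: a predictor that always predicts one-boxing, facing a
    # population in which an assumed 70% of people one-box.

    def payoff(choice, prediction):
        """Payoff in dollars for a given choice and a given prediction."""
        box_b = 1_000_000 if prediction == "one" else 0  # B is filled iff one-boxing was predicted
        box_a = 1_000
        return box_b if choice == "one" else box_a + box_b

    one_box_rate = 0.70  # assumed base rate of one-boxers

    # Omega's strategy: always predict "one".
    accuracy = one_box_rate  # 70% "accuracy" with zero correlation to the individual
    print(f"apparent accuracy: {accuracy:.0%}")
    print("a one-boxer gets:", payoff("one", "one"))  # 1,000,000
    print("a two-boxer gets:", payoff("two", "one"))  # 1,001,000

Against such a record, two-boxing strictly dominates, which is the point: a high proportion of past successes is not the same thing as a correlation between the prediction and each individual’s actual choice.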
FAWS also writes:
You yourself claim to know what you would do in the boxing experiment
Actually, I don’t really want to make that claim. Although I’ve written things like ‘I would take both boxes’, I really should have written ‘I should take both boxes’. I’m stating a correct decision, not making a prediction about my actual actions. Right now, I predict about a 70% chance of two-boxing given the situation as stated in the original post, although I’ve never tried to calculate my estimates of probabilities, so who knows what that really means. (H’m, 70% again? Nope, I don’t trust that calibration at all!)
FAWS writes elsewhere:
Making a choice between two options […] just means that attributing the reason for your taking whatever option you take is most usefully attributed to you (and not e.g. gravity, government, the person holding a gun to you head etc.).
I don’t see what the gun has to do with it; this is a perfectly good problem in decision theory:
Suppose that you have a button that, if pressed, will trigger a bomb that kills two strangers on the other side of the world. I hold a gun to your head and threaten to shoot you if you don’t press the button. Should you press it?
A person who presses the button in that situation can reasonably say afterwards ‘I had no choice! Toby held a gun to my head!’, but that doesn’t invalidate the question. Such a person might even panic and make the question irrelevant, but it’s still a good question.
If it is a fact about you that you will leave the choice up to chance then Omega probably doesn’t offer you to take part in the first place.
So that’s how Omega gets such a good record! (^_^)
Understanding the question really is important. I’ve been interpreting it something like this: you interrupt your normal thought processes to go through a complete evaluation of the situation before you, then see what you do. (This is exactly what you cannot do if you panic in the gun problem above.) So perhaps we can predict with a certain accuracy that an utter bigot will take one course of action, but that is not what the bigot should do, nor is it what they will do if they discard their prejudices and decide afresh.
Now that I think about it, I see some problems with this interpretation, and also some refinements that might fix it. (The first thing to do is to make it less dependent on the specific person making the decision.) But I’ll skip the refinements. It’s enough to notice that Omega might very well predict that a person will not take the time to think things through, so there is poor correlation between what one should do and what Omega will predict, even though the decision is based on what the world would be like if one did take the time.
I still think that (modulo refinements) this is a good interpretation of what most people would mean if they tell a story and then ask ‘What should this person do?’. (I can try to defend that claim if anybody still wants me to after they finish this comment.) In that case, I stand by my decision that one should take both boxes, at least if there is no good evidence of new physics.
However, I now realise that there is another interpretation, which is more practical, however much the ordinary person might not interpret things this way. That is: sit down and think through the whole situation now, long before you are ever faced with it in real life, and decide what to do. One obvious benefit of this is that when I hold a gun to your head, you won’t panic, because you will be prepared. More generally, this is what we are all actually doing right now! So as we make these idle philosophical musings, let’s be practical, and decide what we’ll do if Omega ever offers us this deal.
In this case, I agree that I will be better off (given the extremely unlikely but possible assumption that I am ever in this situation) if I have decided now to take only Box B. As RobinZ points out, I might change my mind later, but that can’t be helped (and to a certain extent shouldn’t be helped, since it’s best if I take two boxes after Omega predicts that I’ll only take one, but we can’t judge that extent if Omega is smarter than us, so really there’s no benefit to holding back at all).
If Omega is fallible, then the value of one-boxing falls drastically, and even adjusting the amount of money doesn’t help in the end; once Omega’s proportion of past success matches the observed proportion in experiments (or whatever our best guess of the actual proportion of real people is), then I’m back to two-boxing, since I expect that Omega simply always predicts one-boxing.
In hindsight, it’s obvious that the original post was about decision in this sense, since Eliezer was talking about an AI that modifies its decision procedures in anticipation of facing Omega in the future. Similarly, we humans modify our decision procedures by making commitments and letting ourselves invent rationalisations for them afterwards (although the problem with this is that it makes it hard to change our minds when we receive new information). So obviously Eliezer wants us to decide now (or at least well ahead of time) and use our leet Methods of Rationality to keep the rationalisations in check.
So I hereby decide that I will pick only one box. (You hear that, Omega!?) Since I am honest (and strongly doubt that Omega exists), I’ll add that I may very well change my mind if this ever really happens, but that’s about what I would do, not what I should do. And in a certain sense, I should change my mind … then. But in another sense, I should (and do!) choose to be a one-boxer now.
(Thanks also to CarlShulman, whom I haven’t quoted, but whose comment was a big help in drawing my attention to the different senses of ‘should’, even though I didn’t really adopt his analysis of them.)
If Omega is fallible, then the value of one-boxing falls drastically, and even adjusting the amount of money doesn’t help in the end;
Assume Omega has a probability X of correctly predicting your decision:
If you choose to two-box:
X chance of getting $1000
(1-X) chance of getting $1,001,000
If you choose to take box B only:
X chance of getting $1,000,000
(1-X) chance of getting $0
Your expected utilities for two-boxing and one-boxing are (respectively):
E2 = 1000X + (1-X)(1,001,000)
E1 = 1,000,000X
For E2 > E1, we must have 1000X + 1,001,000 − 1,001,000X − 1,000,000X > 0, or 1,001,000 > 2,000,000X, or
X < 0.5005
So as long as Omega can maintain a greater than 50% accuracy, you should expect to earn more money by one-boxing. Since the solution seems so simple, and since I’m a total novice at decision theory, it’s possible I’m missing something here, so please let me know.
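A quick numerical check of that threshold (just a sketch of the algebra above, nothing more):

    # Break-even accuracy for the payoff schedule described above:
    # one-boxing pays more in expectation iff X > 0.5005.

    def expected_two_box(x):
        return 1_000 * x + 1_001_000 * (1 - x)

    def expected_one_box(x):
        return 1_000_000 * x

    for x in (0.40, 0.50, 0.5005, 0.51, 0.70, 0.99):
        e1, e2 = expected_one_box(x), expected_two_box(x)
        if e1 > e2 + 1e-6:
            winner = "one-box"
        elif e2 > e1 + 1e-6:
            winner = "two-box"
        else:
            winner = "break-even"
        print(f"X = {x:.4f}: E1 = {e1:>12,.2f}  E2 = {e2:>12,.2f}  -> {winner}")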
So as long as Omega can maintain a greater than 50% accuracy, you should expect to earn more money by one-boxing. Since the solution seems so simple, and since I’m a total novice at decision theory, it’s possible I’m missing something here, so please let me know.
Your calculation is fine. What you’re missing is that Omega has a record of 70% accuracy because Omega always predicts that a person will one-box and 70% of people are one-boxers. So Omega always puts the million dollars in Box B, and I will always get $1,001,000 if I’m one of the 30% of people who two-box.
At least, that is a possibility, which your calculation doesn’t take into account. I need evidence of a correlation between Omega’s predictions and the participants’ actual behaviour, not just evidence of correct predictions. My prior probability distribution for how often people one-box isn’t even concentrated very tightly around 70% (which is just a number that I remember reading once as the result of one survey), so anything short of a long run of predictions with very high proportion of correct ones will make me suspect that Omega is pulling a trick like this.
So the problem is much cleaner as Eliezer states it, with a perfect record. (But if even that record is short, I won’t buy it.)
Oops, I see that RobinZ already replied, and with calculations. This shows that I should still remove the word ‘drastically’ from the bit that nhamann quoted.
Wait—we can’t assume that the probability of being correct is the same for two-boxing and one-boxing. Suppose Omega has a probability X of predicting one when you choose one and Y of predicting one when you choose two.
E1 = E($1 000 000) * X
E2 = E($1 000) + E($1 000 000) * Y
The special case you list corresponds to Y = 1 - X, but in the general case, we can derive that E1 > E2 implies
X > Y + E($1 000) / E($1 000 000)
If we assume linear utility in wealth, this corresponds to a difference of 0.001. If, alternatively, we choose a median net wealth of $93 100 (the U.S. figure) and use log-wealth as the measure of utility, the required difference increases to 0.004 or so. Either way, unless you’re dead broke (e.g. net wealth $1), you had better be extremely confident that you can fool the interrogator before you two-box.
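And a small sketch of those two utility assumptions (using the same $93 100 wealth figure as above; the exact numbers aren’t the point):

    # Threshold from the comment above: one-boxing wins iff
    # X - Y > U(gain $1,000) / U(gain $1,000,000).
    import math

    def linear_gain(amount):
        return amount

    def log_wealth_gain(amount, wealth=93_100):  # median net wealth assumed above
        return math.log(wealth + amount) - math.log(wealth)

    def threshold(utility):
        return utility(1_000) / utility(1_000_000)

    print(f"linear utility:     X - Y > {threshold(linear_gain):.4f}")      # 0.0010
    print(f"log-wealth utility: X - Y > {threshold(log_wealth_gain):.4f}")  # about 0.0043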
You underestimate the meaning of superintelligence. One way of defining a superintelligence that wins at Newcomb without violating causality is to assume that the universe is like a computer simulation, such that it can be defined by a set of physical laws and a very long string of random numbers. If Omega knows the laws and random numbers that define the universe, shouldn’t Omega be able to predict your actions with 100% accuracy? And then wouldn’t you want to choose the action that results in you winning a lot more money?
So part of the definition of a superintelligence is that the universe is like that and Omega knows all that? In other words, if I have convincing evidence that Omega is superintelligent, then I must have convincing evidence that the universe is a computer simulation, etc? Then that changes things; just as the Second Law of Thermodynamics doesn’t apply to Maxwell’s Demon, so the law of forward causality (which is actually a consequence of the Second Law, under the assumption of no time travel) doesn’t apply to a superintelligence. So yes, then I would pick only Box B.
This just goes to show how important it is to understand exactly what the problem states.
The computer-simulation assumption isn’t necessary; the only thing that matters is that Omega is transcendentally intelligent, and it has all the technology that you might imagine a post-Singularity intelligence might have (we’re talking Shock Level 4). So Omega scans your brain by using some technology that is effectively indistinguishable from magic, and we’re left to assume that it can predict, to a very high degree of accuracy, whether you’re the type of person who would take one or two boxes.
Omega doesn’t have to actually simulate your underlying physics, it just needs a highly accurate model, which seems reasonably easy to achieve for a superintelligence.
If its model is good enough that it violates the Second Law as we understand it, fine, I’ll pick only Box B, but I don’t see anything in the problem statement that implies this. The only evidence that I’m given is that it’s made a run of perfect predictions (of unknown length!), is smarter than us, and is from very far away. That’s not enough for new physics.
And just having a really good simulation of my brain, of the sort that we could imagine doing using known physics but just don’t have the technical capacity for, is definitely not good enough. That makes the probability that I’ll act as predicted very high, but I’ll still come out worse if, after the boxes have been set, I’m unlucky enough to only pick Box B anyway (or come out better if I’m lucky enough to pick both boxes anyway, if Omega pegs me for a one-boxer).
If its model is good enough that it violates the Second Law as we understand it [...]
It doesn’t have to be even remotely close to good enough for that in this scenario. I’d bet a sufficiently good human psychologist could take Omega’s role and get it 90%+ right if he tests and interviews the people extensively first (without them knowing the purpose) and gets to exclude people he is unsure about. A superintelligent being should be far, far better at this.
You yourself claim to know what you would do in the boxing experiment, and you are an agent limited by conventional physics. There is no physical law that forbids another agent from knowing you as well as (or even better than) you know yourself.
You’ll have to explain why you think 99.99% (or whatever) is not good enough; a 0.01% chance to win $1000 shouldn’t make up for a 99.99% chance of losing $999,000.
I’m not reading 127 comments, but as a newcomer who’s been invited to read this page, along with barely a dozen others, as an introduction, I don’t want to leave this unanswered, even though what I have to say has probably already been said.
First of all, the answer to Newcomb’s Problem depends a lot on precisely what the problem is. I have seen versions that posit time travel, and therefore backwards causality. In that case, it’s quite reasonable to take only one box, because your decision to do so does have a causal effect on the amount in Box B. Presumably causal decision theorists would agree.
However, in any version of the problem where there is no clear evidence of violations of currently known physics and where the money has been placed by Omega before my decisions, I am a two-boxer. Yet I think that your post above must not be talking about the same problem that I am thinking of, especially at the end. Although you never said so, it seems to me that you must be talking about a problem which says “If you choose Box B, then it will have a million dollars; if you choose both boxes, then Box B will be empty.”. But that is simply not what the facts will be if Omega has made the decision in the past and currently understood physics applies. In the problem as stated, Omega may make mistakes in the future, and that makes all the difference.
It’s presumptuous of me to assume that you’re talking about a different problem from the one that you stated, I know. But as I read the psychological states that you suggest that I might have —that I might wish that I considered one-boxing rational, for example—, they seem utterly insane. Why would I wish such a thing? What does it have to do with anything? The only thing that I can wish for is that Omega has predicted that I will be a one-boxer, which has nothing to do with what I consider rational now.
The quotation from Joyce explains it well, up until the end, where poor phrasing may have confused you. The last sentence should read:
It is simply not true that Rachel envies Irene’s choice. Rachel envies Irene’s situation, the situation where there is a million dollars in Box B. And if Rachel were in that situation, then she would still take both boxes! (At least if I understand Joyce correctly.)
Possibly one thing that distinguishes me from one-boxers, and maybe even most two-boxers, is that I understand fundamental physics rather thoroughly and my prior has a very strong presumption against backwards causality. The mere fact that Omega has made successful predictions about Newcomb’s Paradox will never be enough to overrule that. Even being superintelligent and coming from another galaxy is not enough, although things change if Omega (known to be superintelligent and honest) claims to be a time-traveller. Perhaps for some one-boxers, and even for some irrational two-boxers, Omega’s past success at prediction is good evidence for backwards causality, but not for me.
So suppose that somebody puts two boxes down before me, presents convincing evidence for the situation as you stated it above (but no more), and goes away. Then I will simply take all of the money that this person has given me: both boxes. Before I open them, I will hope that they predicted that I will choose only one. After I open them, if I find Box B empty, then I will wish that they had predicted that I would choose only one. But I will not wish that I had chosen only one. And I certainly will not hope, beforehand, that I will choose only one and yet nevertheless choose two; that would indeed be irrational!
You are disposed to take two boxes. Omega can tell. (Perhaps by reading your comment. Heck, I can tell by reading your comment, and I’m not even a superintelligence.) Omega will therefore not put a million dollars in Box B if it sets you a Newcomb’s problem, because its decision to do so depends on whether you are disposed to take both boxes or not, and you are.
I am disposed to take one box. Omega can tell. (Perhaps by reading this comment. I bet you can tell by reading my comment, and I also bet that you’re not a superintelligence.) Omega will therefore put a million dollars in Box B if it sets me a Newcomb’s problem, because its decision to do so depends on whether I am disposed to take both boxes or not, and I’m not.
If we both get pairs of boxes to choose from, I will get a million dollars. You will get a thousand dollars. I will be monetarily better off than you.
But wait! You can fix this. All you have to do is be disposed to take just Box B. You can do this right now; there’s no reason to wait until Omega turns up. Omega does not care why you are so disposed, only that you are so disposed. You can mutter to yourself all you like about how silly the problem is; as long as you wander off with just B under your arm, it will tend to be the case that you end the day a millionaire.
Sometime ago I figured out a refutation of this kind of reasoning in Counterfactual Mugging, and it seems to apply in Newcomb’s Problem too. It goes as follows:
Imagine another god, Upsilon, that offers you a similar two-box setup—except to get the $2M in the box B, you must be a one-boxer with regard to Upsilon and a two-boxer with regard to Omega. (Upsilon predicts your counterfactual behavior if you’d met Omega instead.) Now you must choose your dispositions wisely because you can’t win money from both gods. The right disposition depends on your priors for encountering Omega or Upsilon, which is a “bead jar guess” because both gods are very improbable. In other words, to win in such problems, you can’t just look at each problem individually as it arises—you need to have the correct prior/predisposition over all possible predictors of your actions, before you actually meet any of them. Obtaining such a prior is difficult, so I don’t really know what I’m predisposed to do in Newcomb’s Problem if I’m faced with it someday.
Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I’m likely to encounter. Upsilon treats me on the basis of a guess I would subjunctively make without knowledge of Upsilon. It is therefore not surprising that I tend to do much better with Omega than with Upsilon, because the relevant choices being made by me are being made with much better knowledge. To put it another way, when Omega offers me a Newcomb’s Problem, I will condition my choice on the known existence of Omega, and all the Upsilon-like gods will tend to cancel out into Pascal’s Wagers. If I run into an Upsilon-like god, then, I am not overly worried about my poor performance—it’s like running into the Christian God, you’re screwed, but so what, you won’t actually run into one. Even the best rational agents cannot perform well on this sort of subjunctive hypothesis without much better knowledge while making the relevant choices than you are offering them. For every rational agent who performs well with respect to Upsilon there is one who performs poorly with respect to anti-Upsilon.
On the other hand, beating Newcomb’s Problem is easy, once you let go of the idea that to be “rational” means performing a strange ritual cognition in which you must only choose on the basis of physical consequences and not on the basis of correct predictions that other agents reliably make about you, so that (if you choose using this bizarre ritual) you go around regretting how terribly “rational” you are because of the correct predictions that others make about you. I simply choose on the basis of the correct predictions that others make about me, and so I do not regret being rational.
And these questions are highly relevant and realistic, unlike Upsilon; in the future we can expect there to be lots of rational agents that make good predictions about each other.
In what sense can you update? Updating is about following a plan, not about deciding on a plan. You already know that it’s possible to observe anything, you don’t learn anything new about environment by observing any given thing. There could be a deep connection between updating and logical uncertainty that makes it a good plan to update, but it’s not obvious what it is.
Huh? Updating is just about updating your map. (?) The next sentence I didn’t understand the reasoning of, could you expand?
Intuitively, the notion of updating a map of fixed reality makes sense, but in the context of decision-making, formalization in full generality proves elusive, even unnecessary, so far.
By making a choice, you control the truth value of certain statements—statements about your decision-making algorithm and about mathematical objects depending on your algorithm. Only some of these mathematical objects are part of the “real world”. Observations affect what choices you make (“updating is about following a plan”), but you must have decided beforehand what consequences you want to establish (“[updating is] not about deciding on a plan”). You could have decided beforehand to care only about mathematical structures that are “real”, but what characterizes those structures apart from the fact that you care about them?
Vladimir talks more about his crazy idea in this comment.
Pascal’s Wagers, huh. So your decision theory requires a specific prior?
This is not a refutation, because what you describe is not about the thought experiment. In the thought experiment, there are no Upsilons, and so nothing to worry about. It is if you face this scenario in real life, where you can’t be given guarantees about the absence of Upsilons, that your reasoning becomes valid. But it doesn’t refute the reasoning about the thought experiment where it’s postulated that there are no Upsilons.
(Original thread, my discussion.)
Thanks for dropping the links here. FWIW, I agree with your objection. But at the very least, the people claiming they’re “one-boxers” should also make the distinction you make.
Also, user Nisan tried to argue that various Upsilons and other fauna must balance themselves out if we use the universal prior. We eventually took this argument to email, but failed to move each other’s positions.
Just didn’t want you confusing people or misrepresenting my opinion, so made everything clear. :-)
OK. I assume the usual (Omega and Upsilon are both reliable and sincere, I can reliably distinguish one from the other, etc.)
Then I can’t see how the game doesn’t reduce to standard Newcomb, modulo a simple probability calculation, mostly based on “when I encounter one of them, what’s my probability of meeting the other during my lifetime?” (plus various “actuarial” calculations).
If I have no information about the probability of encountering either, then my decision may be incorrect—but there’s nothing paradoxical or surprising about this, it’s just a normal, “boring” example of an incomplete information problem.
I can’t see why that is—again, assuming that the full problem is explained to you on encountering either Upsilon or Omega, both are truhful, etc. Why can I not perform the appropriate calculations and make an expectation-maximising decision even after Upsilon-Omega has left? Surely Omega-Upsilon can predict that I’m going to do just that and act accordingly, right?
Yes, this is a standard incomplete information problem. Yes, you can do the calculations at any convenient time, not necessarily before meeting Omega. (These calculations can’t use the information that Omega exists, though.) No, it isn’t quite as simple as you state: when you meet Omega, you have to calculate the counterfactual probability of you having met Upsilon instead, and so on.
Something seems off about this, but I’m not sure what.
I’m pretty sure the logic is correct. I do make silly math mistakes sometimes, but I’ve tested this one on Vladimir Nesov and he agrees. No comment from Eliezer yet (this scenario was first posted to decision-theory-workshop).
It reminds me vaguely of Pascal’s Wager, but my cached responses thereunto are not translating informatively.
Then I think the original Newcomb’s Problem should remind you of Pascal’s Wager just as much, and my scenario should be analogous to the refutation thereof. (Thereunto? :-)
No, that’s not what I should do. What I should do is make Omega think that I am disposed to take just Box B. If I can successfully make Omega think that I’ll take only Box B but still take both boxes, then I should. But since Omega is superintelligent, let’s take it as understood that the only way to make Omega think that I’ll take only Box B is to make it so that I’ll actually take Box B. Then that is what I should do.
But I have to do it now! (I don’t do it now only because I don’t believe that this situation will ever happen.) Once Omega has placed the boxes and left, if the known laws of physics apply, then it’s too late!
If you take only Box B and get a million dollars, wouldn’t you regret having not also taken Box A? Not only would you have gotten a thousand dollars more, you’d also have shown up that know-it-all superintelligent intergalactic traveller too! That’s a chance that I’ll never have, since Omega will read my comment here and leave my Box B empty, but you might have that chance, and if so then I hope you’ll take it.
It’s not really too late then. Omega can predict what you’ll do between seeing the boxes, and choosing which to take. If this is going to include a decision to take one box, then Omega will put a million dollars in that box.
I will not regret taking only one box. It strikes me as inconsistent to regret acting as the person I most wish to be, and it seems clear that the person I most wish to be will take only one box; there is no room for approved regret.
If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin’s comment below). I agree that if causality doesn’t work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.
If known physics applies, then Omega can predict all it likes, but my actions after it has placed the boxes cannot affect that prediction. There is always the chance that it predicts that I will take both boxes but I take only Box B. There is even the chance that it will predict that I will take only Box B but I take both boxes. Nothing in the problem statement rules that out. It would be different if that were actually impossible for some reason.
I knew that you wouldn’t, of course, since you’re a one-boxer. And we two-boxers will not regret taking both boxes, even if we find Box B empty. Better $1000 than nothing, we will think!
Ah, I see what the probem is. You have a confused notion of free will and what it means to make a choice.
Making a choice between two options doesn’t mean there is a real chance that you might take either option (there always is at least an infinitesimal chance, but that it always true even for things that are not usefully described as a choice). It just means that attributing the reason for your taking whatever option you take is most usefully attributed to you (and not e.g. gravity, government, the person holding a gun to you head etc.). In the end, though, it is (unless the choice is so close that random noise makes the difference) a fact about you that you will make the choice you will make. And it is in principle possible for others to discover this fact about you.
If it is a fact about you that you will one-box it is not possible that you will two-box. If it is a fact about you that you will two-box it is not possible that you will one-box. If it is a fact about you that you will leave the choice up to chance then Omega probably doesn’t offer you to take part in the first place.
Now, when deciding what choice to make it is usually most useful to pretend there is a real possibility of taking either option, since that generally causes facts about you that are more benefitial to you. And that you do that is just another fact about you, and influences the fact about which choice you make. Usually the fact which choice you will make has no consequences before you make your choice, and so you can model the rest of the world as being the same in either case up to that point when counterfactually considering the consequences of either choice. But the fact about which choice you will make is just another fact like any other, and is allowed, even if it usually doesn’t, to have consequences before that point in time. If it does it is best, for the very same reason you pretend that either choice is a real possibility in the first place, to also model the rest of the world as different contingent on your choice. That doesn’t mean backwards causality. Modeling the word in this way is just another fact about you that generates good outcomes.
Alicorn:
TobyBartels:
I remember reading an article about someone who sincerely lacked respect for people who were ‘soft’ (not exact quote) on the death penalty … before ending up on the jury of a death penalty case, and ultimately supporting life in prison instead. It is not inconceivable that a sufficiently canny analyst (e.g. Omega) could deduce that the process of being picked would motivate you to reconsider your stance. (Or, perhaps more likely, motivate a professed one-boxer like me to reconsider mine.)
Beware hidden inferences. Taboo causality.
I don’t see what that link has to do with anything in my comment thread. (I haven’t read most of the other threads in reply to this post.)
I should explain what I mean by ‘causality’. I do not mean some metaphysical necessity, whereby every event (called an ‘effect’) is determined (or at least influenced in some asymmetric way) by other events (called its ‘causes’), which must be (or at least so far seem to be) prior to the effect in time, leading to infinite regress (apparently back to the Big Bang, which is somehow an exception). I do not mean anything that Aristotle knew enough physics to understand in any but the vaguest way.
I mean the flow of macroscopic entropy in a physical system.
The best reference that I know on the arrow of time is Huw Price’s 1996 book Time’s Arrow and Archimedes’ Point. But actually I didn’t understand how entropy flow leads to a physical concept of causality until several years after I read that, so that might not actually help, and I’m having no luck finding the Internet conversation that made it click for me.
But basically, I’m saying that, if known physics applies, then P(there is money in Box B|all information available on a macroscopic level when Omega placed the boxes) = P(there is money in Box B|all information … placed the boxes & I pick both boxes), even though P(I pick both boxes|all information … placed the boxes) < 1, because macroscopic entropy strictly increases between the placing of the boxes and the time that I finally pick a box.
So I need to be given evidence that known physics does not apply before I pick only Box B, and a successful record of predictions by Omega will not do that for me.
From Andy Egan.
I would suggest looking at your implicit choice of counterfactuals and their role in your decision theory. Standard causal decision theory involves local violations of the laws of physics (you assign probabilities to the world being such that you’ll one-box, or such that you’ll one-box, and then ask what miracle magically altering your decision, without any connection to your psychological dispositions, etc, would deliver the highest utility). Standard causal decision theory is a normative principle for action, that says to do the action that would deliver the most utility if a certain kind of miracle happened. But you can get different versions of causal decision theory by substituting different sorts of miracles, e.g. you can say: “if I one-box, then I have a psychology that one-boxes, and likewise for two-boxing” so you select the action such that a miracle giving you the disposition to do so earlier on would have been better. Yet another sort of counterfactual that can be hooked up to the causal decision theory framework would go “there’s some mathematical fact about what decision(decisions given Everett) my brain structure leads to in standard physics, and the predictor has access to this mathematical info, so I’ll select the action that would be best brought about by a miracle changing that mathematical fact”.
Thanks for the replies, everybody!
This is a global response to several replies within my little thread here, so I’ve put it at nearly the top level. Hopefully that works out OK.
I’m glad that FAWS brought up the probabilistic version, because the greater the probability that Omega makes mistakes, the more inclined I am to take two boxes. I once read the claim that 70% of people, when presented with Newcomb’s Paradox in an experiment, say that they would take only one box. If this is accurate, then Omega can achieve a 70% level of accuracy simply by predicting that everybody is a one-boxer. Even if 70% is not the right figure, you can still make the paradox work by adjusting the dollar amounts, as long as the bias is large enough to show up reliably in the record of Omega’s past predictions. (To be fair, the proportion of two-boxers will probably rise as Omega’s accuracy falls, and changing the stakes should also affect people’s choices; there may not be a fixed point, although I expect that there is.)
If, in addition to the problem as stated (but with only a 70% probability of success), I know that Omega always predicts one-boxing, then (hopefully) everybody agrees that I should take both boxes. There needs to be some correlation between Omega’s predictions and the actual outcomes, not just a high proportion of past successes.
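Here is a tiny sketch of that worry (the 70% figure is only the survey number I recall, and the payoffs are the standard ones; nothing here is new evidence about Omega):

```python
# A predictor that always says "one-box" matches a 70% one-boxing base rate,
# even though its prediction carries no information about my particular choice.
p_one_box = 0.70                                      # assumed base rate

accuracy = p_one_box * 1.0 + (1 - p_one_box) * 0.0    # 0.70, from the base rate alone
# Under this policy Box B is always filled, so two-boxing nets $1000 more:
payoff_if_i_two_box = 1_001_000
payoff_if_i_one_box = 1_000_000
print(accuracy, payoff_if_i_two_box - payoff_if_i_one_box)
```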
FAWS also writes:
Actually, I don’t really want to make that claim. Although I’ve written things like ‘I would take both boxes’, I really should have written ‘I should take both boxes’. I’m stating a correct decision, not making a prediction about my actual actions. Right now, I predict about a 70% chance of two-boxing given the situation as stated in the original post, although I’ve never tried to calculate my estimates of probabilities, so who knows what that really means. (H’m, 70% again? Nope, I don’t trust that calibration at all!)
FAWS writes elsewhere:
I don’t see what the gun has to do with it; this is a perfectly good problem in decision theory:
Suppose that you have a button that, if pressed, will trigger a bomb that kills two strangers on the other side of the world. I hold a gun to your head and threaten to shoot you if you don’t press the button. Should you press it?
A person who presses the button in that situation can reasonably say afterwards ‘I had no choice! Toby held a gun to my head!’, but that doesn’t invalidate the question. Such a person might even panic and make the question irrelevant, but it’s still a good question.
So that’s how Omega gets such a good record! (^_^)
Understanding the question really is important. I’ve been interpreting it along these lines: you interrupt your normal thought processes to go through a complete evaluation of the situation before you, and then see what you do. (This is exactly what you cannot do if you panic in the gun problem above.) So perhaps we can predict with considerable accuracy that an utter bigot will take one course of action, but that is not what the bigot should do, nor is it what they will do if they discard their prejudices and decide afresh.
Now that I think about it, I see some problems with this interpretation, and also some refinements that might fix it. (The first thing to do is to make it less dependent on the specific person making the decision.) But I’ll skip the refinements. It’s enough to notice that Omega might very well predict that a person will not take the time to think things through, so there is poor correlation between what one should do and what Omega will predict, even though the decision is based on what the world would be like if one did take the time.
I still think that (modulo refinements) this is a good interpretation of what most people would mean if they tell a story and then ask ‘What should this person do?’. (I can try to defend that claim if anybody still wants me to after they finish this comment.) In that case, I stand by my decision that one should take both boxes, at least if there is no good evidence of new physics.
However, I now realise that there is another interpretation, which is more practical, even if it is not how the ordinary person would read the question. That is: sit down and think through the whole situation now, long before you are ever faced with it in real life, and decide what to do. One obvious benefit of this is that when I hold a gun to your head, you won’t panic, because you will be prepared. More generally, this is what we are all actually doing right now! So as we make these idle philosophical musings, let’s be practical, and decide what we’ll do if Omega ever offers us this deal.
In this case, I agree that I will be better off (on the extremely unlikely but possible assumption that I am ever in this situation) if I have decided now to take only Box B. As RobinZ points out, I might change my mind later, but that can’t be helped (and to a certain extent shouldn’t be helped, since it’s best for me if I take two boxes after Omega has already predicted that I’ll take only one; but we can’t judge that extent if Omega is smarter than us, so really there’s no benefit to holding back at all).
If Omega is fallible, then the value of one-boxing falls drastically, and even adjusting the amount of money doesn’t help in the end; once Omega’s proportion of past success matches the observed proportion in experiments (or whatever our best guess of the actual proportion of real people is), then I’m back to two-boxing, since I expect that Omega simply always predicts one-boxing.
In hindsight, it’s obvious that the original post was about decisions in this sense, since Eliezer was talking about an AI that modifies its decision procedures in anticipation of facing Omega in the future. Similarly, we humans modify our decision procedures by making commitments and letting ourselves invent rationalisations for them afterwards (although the problem with this is that it makes it hard to change our minds when we receive new information). So obviously Eliezer wants us to decide now (or at least well ahead of time) and use our leet Methods of Rationality to keep the rationalisations in check.
So I hereby decide that I will pick only one box. (You hear that, Omega!?) Since I am honest (and strongly doubt that Omega exists), I’ll add that I may very well change my mind if this ever really happens, but that’s about what I would do, not what I should do. And in a certain sense, I should change my mind … then. But in another sense, I should (and do!) choose to be a one-boxer now.
(Thanks also to CarlShulman, whom I haven’t quoted, but whose comment was a big help in drawing my attention to the different senses of ‘should’, even though I didn’t really adopt his analysis of them.)
Assume Omega has a probability X of correctly predicting your decision:
If you choose to two-box:
X chance of getting $1000
(1-X) chance of getting $1,001,000
If you choose to take box B only:
X chance of getting $1,000,000
(1-X) chance of getting $0
Your expected utilities for two-boxing and one-boxing are (respectively):
E2 = 1000X + (1-X)1001000
E1 = 1000000X
For E2 > E1, we must have 1000X + 1,001,000 − 1,001,000X − 1,000,000X > 0, or 1,001,000 > 2,000,000X, or
X < 0.5005
So as long as Omega can maintain an accuracy greater than 50.05%, you should expect to earn more money by one-boxing. Since the solution seems so simple, and since I’m a total novice at decision theory, it’s possible I’m missing something here, so please let me know.
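Here is the same arithmetic as a quick sketch, in case I have slipped somewhere (same assumption as above: a single accuracy X applying to one-boxers and two-boxers alike):

```python
# Expected dollar values as a function of Omega's accuracy X.
def e_one_box(x):
    return 1_000_000 * x

def e_two_box(x):
    return 1_000 * x + 1_001_000 * (1 - x)

break_even = 1_001_000 / 2_000_000         # solve e_one_box(X) = e_two_box(X)
print(break_even)                           # 0.5005
assert abs(e_one_box(break_even) - e_two_box(break_even)) < 1e-6
```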
Your calculation is fine. What you may be missing is that Omega could have a record of 70% accuracy simply because Omega always predicts that a person will one-box and 70% of people are one-boxers. In that case, Omega always puts the million dollars in Box B, and I will always get $1,001,000 if I’m one of the 30% of people who two-box.
At least, that is a possibility, which your calculation doesn’t take into account. I need evidence of a correlation between Omega’s predictions and the participants’ actual behaviour, not just evidence of correct predictions. My prior probability distribution for how often people one-box isn’t even concentrated very tightly around 70% (which is just a number that I remember reading once as the result of one survey), so anything short of a long run of predictions with a very high proportion of correct ones will make me suspect that Omega is pulling a trick like this.
So the problem is much cleaner as Eliezer states it, with a perfect record. (But if even that record is short, I won’t buy it.)
Oops, I see that RobinZ already replied, and with calculations. This shows that I should still remove the word ‘drastically’ from the bit that nhamann quoted.
Wait—we can’t assume that the probability of being correct is the same for two-boxing and one-boxing. Suppose Omega has a probability X of predicting one when you choose one and Y of predicting one when you choose two.
The special case you list corresponds to Y = 1 − X, but in the general case, E1 > E2 requires that the difference X − Y exceed a certain threshold.
If we assume linear utility in wealth, the expected gains are E1 = 1,000,000X and E2 = 1,000 + 1,000,000Y, so that threshold is 1,000/1,000,000 = 0.001. If, alternately, we choose a median net wealth of $93,100 (the U.S. figure) and use log-wealth as the measure of utility, the required difference increases to 0.004 or so. Either way, unless you’re dead broke (e.g. net wealth $1), you had better be extremely confident that you can fool the interrogator before you two-box.
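And a small numeric sketch of the general case (linear utility only, with illustrative values of X that are my own and not from the thread; the log-wealth figure above depends on exactly how the comparison is set up):

```python
# X = P(Omega predicts one-boxing | I one-box),
# Y = P(Omega predicts one-boxing | I two-box).
def e_one_box(x):
    return 1_000_000 * x

def e_two_box(y):
    return 1_000 + 1_000_000 * y

print(1_000 / 1_000_000)                    # break-even gap X - Y = 0.001
x = 0.90                                    # illustrative
for y in (0.8995, 0.8985):                  # gaps of 0.0005 and 0.0015
    print(round(x - y, 4), e_one_box(x) > e_two_box(y))
```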
You underestimate the meaning of superintelligence. One way of defining a superintelligence that wins at Newcomb’s Problem without violating causality is to assume that the universe is like a computer simulation, in that it can be defined by a set of physical laws and a very long string of random numbers. If Omega knows the laws and the random numbers that define the universe, shouldn’t Omega be able to predict your actions with 100% accuracy? And then wouldn’t you want to choose the action that results in you winning a lot more money?
So part of the definition of a superintelligence is that the universe is like that and Omega knows all that? In other words, if I have convincing evidence that Omega is superintelligent, then I must have convincing evidence that the universe is a computer simulation, etc? Then that changes things; just as the Second Law of Thermodynamics doesn’t apply to Maxwell’s Demon, so the law of forward causality (which is actually a consequence of the Second Law, under the assumption of no time travel) doesn’t apply to a superintelligence. So yes, then I would pick only Box B.
This just goes to show how important it is to understand exactly what the problem states.
The computer simulation assumption isn’t necessary; the only thing that matters is that Omega is transcendentally intelligent and has all the technology that you might imagine a post-Singularity intelligence would have (we’re talking Shock Level 4). So Omega scans your brain using some technology that is effectively indistinguishable from magic, and we’re left to assume that it can predict, to a very high degree of accuracy, whether you’re the type of person who would take one box or two.
Omega doesn’t have to actually simulate your underlying physics, it just needs a highly accurate model, which seems reasonably easy to achieve for a superintelligence.
If its model is good enough that it violates the Second Law as we understand it, fine, I’ll pick only Box B, but I don’t see anything in the problem statement that implies this. The only evidence that I’m given is that it’s made a run of perfect predictions (of unknown length!), is smarter than us, and is from very far away. That’s not enough for new physics.
And just having a really good simulation of my brain, of the sort that we could imagine doing using known physics but just don’t have the technical capacity for, is definitely not good enough. That makes the probability that I’ll act as predicted very high, but I’ll still come out worse if, after the boxes have been set, I’m unlucky enough to only pick Box B anyway (or come out better if I’m lucky enough to pick both boxes anyway, if Omega pegs me for a one-boxer).
It doesn’t have to be even remotely close to that good for the scenario to work. I’d bet that a sufficiently good human psychologist could take Omega’s role and get it 90%+ right if he tests and interviews the people extensively first (without them knowing the purpose) and gets to exclude anyone he is unsure about. A superintelligent being should be far, far better at this.
You yourself claim to know what you would do in the boxing experiment, and you are an agent limited by conventional physics. There is no physical law that forbids another agent from knowing you as well as (or even better than) you know yourself.
You’ll have to explain why you think 99.99% (or whatever) is not good enough; a 0.01% chance to win $1,000 shouldn’t make up for a 99.99% chance of losing $999,000.
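For what it’s worth, plugging 99.99% into the symmetric-accuracy model used earlier in the thread gives roughly this (a sketch, not part of the original argument):

```python
# Quick check at 99.99% accuracy, assuming the same accuracy X for
# one-boxers and two-boxers alike.
x = 0.9999
e_one_box = 1_000_000 * x                      # about $999,900
e_two_box = 1_000 * x + 1_001_000 * (1 - x)    # about $1,100
print(e_one_box, e_two_box)
```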