Choosing is deliberation, deliberation is choosing. Just consider the alternatives (one-box, two-box) and do the one that results in you having more money.
Clearly that’s two-boxing. Omega already made his choice, so if he thought I’d two-box, I’ll get:
- One box: nothing
- Two boxes: the small reward
If Omega thought I’d one-box:
- One box: big reward
- Two boxes: big reward + small reward
Two-boxing results in more money no matter how Omega thought I’d choose.
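For reference, the dominance argument in this comment can be written out as a small table. This is only a sketch: the thread fixes the big reward at $1000000 but never states the small one, so the $1000 figure and the names `payoff` and `dominates` are assumptions made for illustration.

```python
# Assumed payoffs: BIG comes from the thread, SMALL is the conventional
# figure and is NOT stated anywhere in this discussion.
BIG, SMALL = 1_000_000, 1_000

def payoff(prediction: str, action: str) -> int:
    """Money received given Omega's earlier prediction and the player's action."""
    box_b = BIG if prediction == "one-box" else 0   # B is filled only if one-boxing was predicted
    box_a = SMALL if action == "two-box" else 0     # the small box is taken only when two-boxing
    return box_b + box_a

def dominates(a: str, b: str) -> bool:
    """True if action a pays strictly more than action b for every fixed prediction."""
    return all(payoff(p, a) > payoff(p, b) for p in ("one-box", "two-box"))

for prediction in ("two-box", "one-box"):
    for action in ("one-box", "two-box"):
        print(f"predicted {prediction} / chose {action}: ${payoff(prediction, action):,}")
print("two-boxing dominates:", dominates("two-box", "one-box"))
```

Note that the table holds Omega's prediction fixed while the action varies, which is exactly the step the replies go on to dispute.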
What if I try to predict what Omega does, and do the opposite?
That would mean that either 1) there are some strategies I am incapable of executing, or 2) Omega can’t in principle predict what I do, since it is indirectly predicting itself.
Alternatively, what if instead of me trying to predict Omega, we run this with transparent boxes and I base my decision on what I see in the boxes, doing the opposite of what Omega predicted? Again, Omega is indirectly predicting itself.
I don’t see how this is relevant, but yes, in principle it’s impossible to predict the universe perfectly, on account of the universe + your brain being bigger than your brain. Although, if you live in a bubble universe that is bigger than the rest of the universe, whose interaction with the rest of the universe is limited precisely to your chosen manipulation of the connecting bridge (basically, if you are AIXI), then you may be able to perfectly predict the universe conditional on your actions.
This has pretty much no impact on actual newcomb’s though, since we can just define such problems away by making omega do the obvious thing to prevent such shenanigans (“trolls get no money”). For the purpose of the thought experiment, action-conditional predictions are fine.
IOW, this is not a problem with Newcomb’s. By the way, this has been discussed previously.
You’ve now destroyed the usefulness of Newcomb as a potentially interesting analogy to the real world. In real world games, my opponent is trying to infer my strategy and I’m trying to infer theirs.
If Newcomb is only about a weird world where Omega can try to predict the player’s actions, but the player is not allowed to predict Omega’s, then it’s sort of a silly problem. It’s lost most of its generality because you’ve explicitly disallowed the majority of strategies.
If you allow the player to pursue his own strategy, then it’s still a silly problem, because the question ends up being inconsistent (because if Omega plays Omega, nothing can happen).
In real world games, we spend most of our time trying to make action-conditional predictions. “If I play Foo, then my opponent will play Bar”. There’s no attempting to circularly predict yourself with unconditional predictions. The sensible formulation of Newcomb’s matches that.
(For example, transparent boxes: Omega predicts “if I fill both boxes, then player will ___” and fills the boxes based on that prediction. Or a few other variations on that.)
In many (probably most?) games we consider the opponent’s strategy, not simply their next move. Making moves in an attempt to confuse your opponent’s estimation of your own strategy is a common tactic in many games.
Your “modified Newcomb” doesn’t allow the chooser to have a strategy: they aren’t allowed to say “if I predict Omega did X, I’ll do Y.” It’s a weird sort of game where my opponent takes my strategy into account, but something keeps me from considering my opponent’s.
Can’t Omega follow the strategy of ‘Trolls get no money,’ which by assumption is worse for you? I feel like this would result in some false positives, but perhaps not—and the scenario says nothing about the people who don’t get to play in any case.
No, because that’s fighting the hypothetical. Assume that he doesn’t do that.
It is actually approximately the opposite of fighting the hypothetical. It is managing the people who are trying to fight the hypothetical. Precise wording of the details of the specification can be used to preempt such replies, but for casual definitions that assume good faith, explicit clauses for the distracting edge cases sometimes need to be added.
It is fighting the hypothetical because you are not the only one providing hypotheticals. I am too; I’m providing a hypothetical where the player’s strategy makes this the least convenient possible world for people who claim that having such an Omega is a self-consistent concept. Saying “no, you can’t use that strategy” is fighting the hypothetical.
Moreover, the strategy “pick the opposite of what I predict Omega does” is a member of a class of strategies that have the same problem; it’s just an example of such a strategy that is particularly clear-cut, and the fact that it is clear-cut and blatantly demonstrates the problem with the scenario is the very aspect that leads you to call it trolling Omega. “You can’t troll Omega” becomes equivalent to “you can’t pick a strategy that makes the flaw in the scenario too obvious”.
If your goal is to show that Omega is “impossible” or “inconsistent”, then having Omega adopt the strategy “leave both boxes empty for people who try to predict me / do any other funny stuff” is a perfectly legitimate counterargument. It shows that Omega is in fact consistent if he adopts such a strategy. You have no right to just ignore that counterargument.
Indeed, Omega requires a strategy for when he finds that you are too hard to predict. The only reason such a strategy is not provided beforehand in the default problem description is because we are not (in the context of developing decision theory) talking about situations where you are powerful enough to predict Omega, so such a specification would be redundant. The assumption, for the purpose of illuminating problems with classical decision theory, is that Omega has vastly more computational resources than you do, so that the difficult decision tree that presents the problem will obtain.
By the way, it is extremely normal for there to be strategies you are “incapable of executing”. For example, I am currently unable to execute the strategy “predict what you will say next, and counter it first”, because I can’t predict you. Computation is a resource like any other.
If your goal is to show that Omega is “impossible” or “inconsistent”, then having Omega adopt the strategy “leave both boxes empty for people who try to predict me / do any other funny stuff” is a perfectly legitimate counterargument.
If you are suggesting that Omega read my mind and think “does this human intend to outsmart me, Omega”, then sure he can do that. But that only takes care of the specific version of the strategy where the player has conscious intent.
If you’re suggesting “Omega figures out whether my strategy is functionally equivalent to trying to outsmart me”, you’re basically claiming that Omega can solve the halting problem by analyzing the situation to determine if it’s an instance of the halting problem, and outputting an appropriate answer if that is the case. That doesn’t work.
Indeed, Omega requires a strategy for when he finds that you are too hard to predict.
That still requires that he determine that I am too hard to predict, which either means solving the halting problem or running on a timer. Running on a timer is a legitimate answer, except again it means that there are some strategies I cannot execute.
The assumption, for the purpose of illuminating problems with classical decision theory, is that Omega has vastly more computational resources than you do, so that the difficult decision tree that presents the problem will obtain.
I thought the assumption is that I am a perfect reasoner and can execute any strategy.
except again it means that there are some strategies I cannot execute.
I don’t see how omega running his simulation on a timer makes any difference for this, but either way this is normal and expected. Problem resolved.
I thought the assumption is that I am a perfect reasoner and can execute any strategy.
Not at all. Though it may be convenient to postulate arbitrarily large computing power (as long as Omega’s power is increased to match) so that we can consider brute force algorithms instead of having to also worry about how to make it efficient.
(Actually, if you look at the decision tree for Newcomb’s, the intended options for your strategy are clearly supposed to be “unconditionally one-box” and “unconditionally two-box”, with potentially a mixed strategy allowed. Which is why you are provided with no information whatsoever that would allow you to predict Omega. And indeed the decision tree explicitly states that your state of knowledge is identical whether Omega fills or doesn’t fill the box.)
I don’t see how omega running his simulation on a timer makes any difference for this,
It’s me who has to run on a timer. If I am only permitted to execute 1000 instructions to decide what my answer is, I may not be able to simulate Omega.
Though it may be convenient to postulate arbitrarily large computing power
Yes, I am assuming that I am capable of executing arbitrarily many instructions when computing my strategy.
the intended options for your strategy are clearly supposed to be “unconditionally one-box” and “unconditionally two-box”, with potentially a mixed strategy allowed. Which is why you are provided with no information whatsoever that would allow you to predict Omega
I know what problem Omega is trying to solve. If I am a perfect reasoner, and I know that Omega is, I should be able to predict Omega without actually having knowledge of Omega’s internals.
Actually, if you look at the decision tree for Newcomb’s, the intended options for your strategy are clearly supposed to be “unconditionally one-box” and “unconditionally two-box”,
Deciding which branch of the decision tree to pick is something I do using a process that has, as a step, simulating Omega. It is tempting to say “it doesn’t matter what process you use to choose a branch of the decision tree, each branch has a value that can be compared independently of why you chose the branch”, but that’s not correct. In the original problem, if I just compare the branches without considering Omega’s predictions, I should always two-box. If I consider Omega’s predictions, that cuts off some branches in a way which changes the relative ranking of the choices. If I consider my predictions of Omega’s predictions, that cuts off more branches, in a way which prevents the choices from even having a ranking.
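The first two levels described in this comment (comparing raw branches versus letting Omega's prediction prune them) can be sketched numerically. The `accuracy` parameter, the $1000 small reward, and the function names are assumptions for illustration only; the third level, predicting Omega's prediction, is not modelled here.

```python
BIG, SMALL = 1_000_000, 1_000  # SMALL is an assumed conventional value

def payoff(prediction: str, action: str) -> int:
    """Money received for a given (prediction, action) branch of the tree."""
    return (BIG if prediction == "one-box" else 0) + (SMALL if action == "two-box" else 0)

def expected_payoff(action: str, accuracy: float) -> float:
    """Expected money when Omega's prediction matches the action with probability `accuracy`."""
    other = "two-box" if action == "one-box" else "one-box"
    # With probability `accuracy` Omega predicted this very action;
    # otherwise it predicted the other one.
    return accuracy * payoff(action, action) + (1 - accuracy) * payoff(other, action)

for accuracy in (0.5, 0.99, 1.0):
    one = expected_payoff("one-box", accuracy)
    two = expected_payoff("two-box", accuracy)
    print(f"accuracy {accuracy}: one-box {one:,.0f}, two-box {two:,.0f}")
```

At accuracy 0.5 the prediction carries no information and two-boxing comes out ahead, matching the branch-by-branch comparison; as accuracy approaches 1 the mismatched branches are effectively cut off and the ranking flips.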
Yes, I am assuming that I am capable of executing arbitrarily many instructions when computing my strategy.
But apparently you want to ignore the part when I said Omega has to have his own computing power increased to match. The fact that Omega is vastly more intelligent and computationally powerful than you is a fundamental premise of the problem. This is what stops you from magically “predicting him”.
Look, in Newcomb’s problem you are not supposed to be a “perfect reasoner” with infinite computing time or whatever. You are just a human. Omega is the superintelligence. So, any argument you make that is premised on being a perfect reasoner is automatically irrelevant and inapplicable. Do you have a point that is not based on this misunderstanding of the thought experiment? What is your point, even?
But apparently you want to ignore the part when I said Omega has to have his own computing power increased to match.
It’s already arbitrarily large. You want that expanded to match arbitrarily large?
Look, in Newcomb’s problem you are not supposed to be a “perfect reasoner”
Asking “which box should you pick” implies that you can follow a chain of reasoning which outputs an answer about which box to pick.
It sounds like your decision making strategy fails to produce a useful result.
My decision making strategy is “figure out what Omega did and do the opposite”. It only fails to produce a useful result if Omega fails to produce a useful result (perhaps by trying to predict me and not halting). And Omega goes first, so we never get to the point where I try my decision strategy and don’t halt.
(And if you’re going to respond with “then Omega knows in advance that your decision strategy doesn’t halt”, how’s he going to know that?)
Furthermore, there’s always the transparent boxes situation. Instead of explicitly simulating Omega, I implicitly simulate Omega by looking in the transparent boxes and determining what Omega’s choice was.
What is your point, even?
That Omega cannot be a perfect predictor because being one no matter what strategy the human uses would imply being able to solve the halting problem.
It’s already arbitrarily large. You want that expanded to match arbitrarily large?
When I say “arbitrarily large” I do not mean infinite. You have some fixed computing power, X (which you can interpret as “memory size” or “number of computations you can do before the sun explodes the next day” or whatever). The premise of newcomb’s is that Omega has some fixed computing power Q * X, where Q is really really extremely large. You can increase X as much as you like, as long as Omega is still Q times smarter.
Asking “which box should you pick” implies that you can follow a chain of reasoning which outputs an answer about which box to pick.
Which does not even remotely imply being a perfect reasoner. An ordinary human is capable of doing this just fine.
My decision making strategy is “figure out what Omega did and do the opposite”. It only fails to produce a useful result if Omega fails to produce a useful result (perhaps by trying to predict me and not halting).
Two points: First, if Omega’s memory is Q times larger than yours, you can’t fit a simulation of him in your head. So predicting by simulation is not going to work. Second, if Omega has Q times as much computing time as you, you can try to predict him (by any method) for X steps, at which point the sun explodes. Naturally, Omega simulates you for X steps, notices that you didn’t give a result before the sun explodes, so leaves both boxes empty and flies away to safety.
That Omega cannot be a perfect predictor because being one no matter what strategy the human uses would imply being able to solve the halting problem.
Only under the artificial irrelevant-to-the-thought-experiment conditions that require him to care whether you’ll one-box or two-box after standing in front of the boxes for millions of years thinking about it. Whether or not the sun explodes, or Omega himself imposes a time limit, a realistic Omega only simulates for X steps, then stops. No halting-problem-solving involved.
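A minimal sketch of the "simulate for X steps, then stop" idea, assuming the player's deliberation can be modelled as a Python generator that yields once per step of thinking and returns its decision. All names here (`simulate`, `omega_fills`, the example players) and the $1000 small reward are hypothetical.

```python
from typing import Callable, Generator, Optional, Tuple

BIG, SMALL = 1_000_000, 1_000  # SMALL is an assumed conventional value

Player = Callable[[], Generator[None, None, str]]

def simulate(player: Player, max_steps: int) -> Optional[str]:
    """Run the player's deliberation for at most max_steps; None means no answer in time."""
    run = player()
    for _ in range(max_steps):
        try:
            next(run)                      # one step of the player's thinking
        except StopIteration as done:
            return done.value              # the player reached a decision
    return None                            # budget exhausted, Omega stops simulating

def omega_fills(player: Player, max_steps: int) -> Tuple[int, int]:
    """Return the contents of (small box, big box) under the timer rule."""
    decision = simulate(player, max_steps)
    if decision is None:
        return (0, 0)                      # too slow: both boxes left empty
    return (SMALL, BIG if decision == "one-box" else 0)

def quick_one_boxer() -> Generator[None, None, str]:
    yield                                  # thinks for a single step
    return "one-box"

def quick_two_boxer() -> Generator[None, None, str]:
    yield
    return "two-box"

def endless_deliberator() -> Generator[None, None, str]:
    while True:                            # e.g. stuck trying to out-predict Omega
        yield

print(omega_fills(quick_one_boxer, 10))
print(omega_fills(quick_two_boxer, 10))
print(omega_fills(endless_deliberator, 10))
```

Nothing here requires deciding whether a player halts in general; Omega only needs to know whether the player halts within the budget.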
In other words, if “Omega isn’t a perfect predictor” means that he can’t simulate a physical system for an infinite number of steps in finite time then I agree but don’t give a shit. Such a thing is entirely unnecessary. In the thought experiment, if you are a human, you die of aging after less than 100 years. And any strategy that involves you thinking in front of the boxes until you die of aging (or starvation, for that matter) is clearly flawed anyway.
Furthermore, there’s always the transparent boxes situation. Instead of explicitly simulating Omega, I implicitly simulate Omega by looking in the transparent boxes and determining what Omega’s choice was.
This example is less stupid since it is not based on trying to circularly predict yourself. But in this case Omega just makes action-conditional predictions and fills the boxes however he likes.
If I consider my predictions of Omega’s predictions, that cuts off more branches, in a way which prevents the choices from even having a ranking.
It sounds like your decision making strategy fails to produce a useful result. That is unfortunate for anyone who happens to attempt to employ it. You might consider changing it to something that works.
“Ha! What if I don’t choose One box OR Two boxes! I can choose No Boxes out of indecision instead!” isn’t a particularly useful objection.
No, Nshepperd is right. Omega imposing computation limits on itself solves the problem (such as it is). You can waste as much time as you like. Omega is gone and so doesn’t care whether you pick any boxes before the end of time. This is a standard solution for considering cooperation between bounded rational agents with shared source code.
When attempting to achieve mutual cooperation (essentially what Newcomblike problems are all about) making yourself difficult to analyse only helps against terribly naive intelligences. ie. It’s a solved problem and essentially useless for all serious decision theory discussion about cooperation problems.
If your goal is to show that Omega is “impossible” or “inconsistent”, then having Omega adopt the strategy “leave both boxes empty for people who try to predict me / do any other funny stuff” is a perfectly legitimate counterargument. It shows that Omega is in fact consistent if he adopts such a strategy. You have no right to just ignore that counterargument.
This contradicts the accuracy stated at the beginning. Omega can’t leave both boxes empty for people who try to adopt a mixed strategy AND also maintain his 99.whatever accuracy on one-boxers.
And even if Omega has way more computational power than I do, I can still generate a random number. I can flip a coin that’s 60/40 one-box, two-box. The most accurate Omega can be, then, is to assume I one-box.
This contradicts the accuracy stated at the beginning. Omega can’t leave both boxes empty for people who try to adopt a mixed strategy AND also maintain his 99.whatever accuracy on one-boxers.
He can maintain his 99% accuracy on deterministic one-boxers, which is all that matters for the hypothetical.
Alternatively, if we want to explicitly include mixed strategies as an available option, the general answer is that Omega fills the box with probability = the probability that your mixed strategy one-boxes.
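To see what that rule does to the 60/40 coin, here is a rough sketch. It assumes Omega's fill decision is a separate random draw, independent of the player's actual flip, and it assumes a $1000 small reward; the function names are made up for illustration.

```python
import random

BIG, SMALL = 1_000_000, 1_000  # SMALL is an assumed conventional value

def expected_payoff(p_one_box: float) -> float:
    """Expected money when Box B is filled with probability p_one_box."""
    # Box B contributes BIG with probability p; the small box is collected
    # whenever the player's own coin comes up two-box (probability 1 - p).
    return p_one_box * BIG + (1 - p_one_box) * SMALL

def play_once(p_one_box: float, rng: random.Random) -> int:
    """One round: Omega draws the fill, then the player flips their coin."""
    box_b = BIG if rng.random() < p_one_box else 0
    takes_both = rng.random() >= p_one_box
    return box_b + (SMALL if takes_both else 0)

for p in (0.0, 0.6, 1.0):
    print(f"P(one-box) = {p}: expected ${expected_payoff(p):,.0f}")
```

Under these assumptions the expected payoff rises steadily with the probability of one-boxing, so randomising gains nothing over deterministic one-boxing.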
All of this is very true, and I agree with it wholeheartedly. However, I think Jiro’s second scenario is more interesting, because then predicting Omega is not needed; you can see what Omega’s prediction was just by looking in (the now transparent) Box B.
As I argued in this comment, however, the scenario as it currently is is not well-specified; we need some idea of what sort of rule Omega is using to fill the boxes based on his prediction. I have not yet come up with a rule that would allow Omega to be consistent in such a scenario, though, and I’m not sure if consistency in this situation would even be possible for Omega. Any comments?
As I argued in this comment, however, the scenario as it currently is is not well-specified; we need some idea of what sort of rule Omega is using to fill the boxes based on his prediction.
Previous discussions of Transparent Newcomb’s problem have been well specified. I seem to recall doing so in footnotes so as to avoid distraction.
I have not yet come up with a rule that would allow Omega to be consistent in such a scenario, though, and I’m not sure if consistency in this situation would even be possible for Omega. Any comments?
The problem (such as it is) is that there is ambiguity between the possible coherent specifications, not a complete lack. As your comment points out there are (merely) two possible situations for the player to be in and Omega is able to counter-factually predict the response to either of them, with said responses limited to a boolean. That’s not a lot of permutations. You could specify all 4 exhaustively if you are lazy.
IF (Two box when empty AND One box when full) THEN X IF …
Any difficulty here is in choosing the set of rewards that most usefully illustrate the interesting aspects of the problem.
Any difficulty here is in choosing the set of rewards that most usefully illustrate the interesting aspects of the problem.
I’d say that about hits the nail on the head. The permutations certainly are exhaustively specifiable. The problem is that I’m not sure how to specify some of the branches. Here’s all four possibilities (written in pseudo-code following your example):
1. IF (Two box when empty And Two box when full) THEN X
2. IF (One box when empty And One box when full) THEN X
3. IF (Two box when empty And One box when full) THEN X
4. IF (One box when empty And Two box when full) THEN X
The rewards for 1 and 2 seem obvious; I’m having trouble, however, imagining what the rewards for 3 and 4 should be. The original Newcomb’s Problem had a simple point to demonstrate, namely that logical connections should be respected along with causal connections. This point was made simple by the fact that there’s two choices, but only one situation. When discussing transparent Newcomb, though, it’s hard to see how this point maps to the latter two situations in a useful and/or interesting way.
When discussing transparent Newcomb, though, it’s hard to see how this point maps to the latter two situations in a useful and/or interesting way.
Option 3 is of the most interest to me when discussing the Transparent variant. Many otherwise adamant One Boxers will advocate (what is in effect) 3 when first encountering the question. Since I advocate strategy 2 there is a more interesting theoretical disagreement. ie. From my perspective I get to argue with (literally) less-wrong wrong people, with a correspondingly higher chance that I’m the one who is confused.
The difference between 2 and 3 becomes more obviously relevant when noise is introduced (eg. 99% accuracy Omega). I choose to take literally nothing in some situations. Some think that is crazy...
In the simplest formulation the payoff for three is undetermined. But not undetermined in the sense that Omega’s proposal is made incoherent. Arbitrary as in Omega can do whatever the heck it wants and still construct a coherent narrative. I’d personally call that an obviously worse decision but for simplicity prefer to define 3 as a defect (Big Box Empty outcome).
As for 4… A payoff of both boxes empty (or both boxes full but contaminated with anthrax spores) seems fitting. But simply leaving the large box empty is sufficient for decision theoretic purposes.
Out of interest, and because your other comments on the subject seem well informed, what do you choose when you encounter Transparent Newcomb and find the big box empty?
what do you choose when you encounter Transparent Newcomb and find the big box empty?
This is a question that I find confusing due to conflicting intuitions. Fortunately, since I endorse reflective consistency, I can replace that question with the following one, which is equivalent in my decision framework, and which I find significantly less confusing:
“What would you want to precommit to doing, if you encountered transparent Newcomb and found the big box (a.k.a. Box B) empty?”
My answer to this question would be dependent upon Omega’s rule for rewarding players. If Omega only fills Box B if the player employs the strategy outlined in 2, then I would want to precommit to unconditional one-boxing—and since I would want to precommit to doing so, I would in fact do so. If Omega is willing to reward the player by filling Box B even if the player employs the strategy outlined in 3, then I would see nothing wrong with two-boxing, since I would have wanted to precommit to that strategy in advance. Personally, I find the former scenario—the one where Omega only rewards people who employ strategy 2--to be more in line with the original Newcomb’s Problem, for some intuitive reason that I can’t quite articulate.
What’s interesting, though, is that some people two-box even upon hearing that Omega only rewards the strategy outlined in 2--upon hearing, in other words, that they are in the first scenario described in the above paragraph. I would imagine that their reasoning process goes something like this: “Omega has left Box B empty. Therefore he has predicted that I’m going to two-box. It is extremely unlikely a priori that Omega is wrong in his predictions, and besides, I stand to gain nothing from one-boxing now. Therefore, I should two-box, both because it nets me more money and because Omega predicted that I would do so.”
I disagree with this line of reasoning, however, because it is very similar to the line of reasoning that leads to self-fulfilling prophecies. As a rule, I don’t do things just because somebody said I would do them, even if that somebody has a reputation for being extremely accurate, because then that becomes the only reason it happened in the first place. As with most situations involving acausal reasoning, however, I can only place so much confidence in me being correct, as opposed to me being so confused I don’t even realize I’m wrong.
It would seem to me that Omega’s actions would be as follows:
1. IF (Two box when empty And Two box when full) THEN Empty
2. IF (One box when empty And One box when full) THEN Full
3. IF (Two box when empty And One box when full) THEN Empty or Full
4. IF (One box when empty And Two box when full) THEN Refuse to present boxes
Cases 1 and 2 are straightforward. Case 3 works for the problem, no matter which set of boxes Omega chooses to leave.
In order for Omega to maintain its high prediction accuracy, though, it is necessary, if Omega predicts that a given player will choose option 4, that Omega simply refuse to present the transparent boxes to this player. Or, at least, that the number of players who follow the other three options should vastly outnumber the fourth-option players.
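The four cases can be checked mechanically. The sketch below assumes one particular rule, namely that a fill is "correct" when Box B is full exactly if the player one-boxes on seeing that state; the function name `consistent_fills` is hypothetical.

```python
from itertools import product
from typing import List

ACTIONS = ("one-box", "two-box")

def consistent_fills(when_empty: str, when_full: str) -> List[str]:
    """Box B states for which Omega's fill agrees with what the player then does.

    Assumed rule: the fill is correct when B is full exactly if the player
    one-boxes after seeing that state.
    """
    response = {"empty": when_empty, "full": when_full}
    return [
        state
        for state in ("empty", "full")
        if (response[state] == "one-box") == (state == "full")
    ]

for when_empty, when_full in product(ACTIONS, repeat=2):
    fills = consistent_fills(when_empty, when_full)
    outcome = " or ".join(fills) if fills else "no consistent fill (refuse to present)"
    print(f"{when_empty} when empty, {when_full} when full -> {outcome}")
```

Only the "one-box when empty, two-box when full" strategy leaves Omega with no self-consistent way to fill the box, which is the case where refusing to present the boxes comes in.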
This is an interesting response because 4 is basically what Jiro was advocating earlier in the thread, and you’re basically suggesting that Omega wouldn’t even present the opportunity to people who would try to do that. Would you agree with this interpretation of your comment?
If we take the assumption, for the moment, that the people who would take option 4 form at least 10% of the population in general (this may be a little low), and we further take the idea that Omega has a track record of success in 99% or more of previous trials (as is often specified in Newcomb-like problems), then it is clear that whatever algorithm Omega is using to decide who to present the boxes to is biased, and biased heavily, against offering the boxes to such a person.
Consider:
P(P) = The probability that Omega will present the boxes to a given person.
P(M|P) = The probability that Omega will fill the boxes correctly (empty for a two-boxer, full for a one-boxer)
P(M’|P) = The probability that Omega will fail to fill the boxes correctly
P(O) = The probability that the person will choose option 4
P(M’|O) = 1 (from the definition of option 4)
therefore P(M|O) = 0
and if Omega is a perfect predictor, then P(M|O’) = 1 as well.
P(M|P) = 0.99 (from the statement of the problem)
P(O) = 0.1 (assumed)
Now, of all the people to whom boxes are presented, Omega is only getting at most one percent wrong; P(M’|P) ≤ 0.01. Since P(M’|O) = 1, and P(M’|O’) = 0, it follows that P(O|P) ≤ 0.01.
If Omega is a less than perfect predictor, then P(M’|O’) > 0, and P(O|P) < 0.01.
And, since P(O|P) ≤ 0.01 < P(O) = 0.1, I therefore conclude that Omega must have a bias, and a fairly strong one, against presenting the boxes to such perverse players.
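Spelling the bound out with the law of total probability may help. This is a sketch that conditions throughout on the boxes being presented, and it assumes an option-4 player always defeats the prediction once presented, i.e. P(M'|O,P) = 1. The quantity bounded directly is then P(O|P); Bayes' rule converts it into a statement about P(P|O).

```latex
\begin{align*}
P(M' \mid P) &= P(M' \mid O, P)\,P(O \mid P) + P(M' \mid O', P)\,P(O' \mid P) \\
             &\ge 1 \cdot P(O \mid P), \\
\text{so}\quad P(O \mid P) &\le P(M' \mid P) \le 0.01 < 0.1 = P(O), \\
P(P \mid O) &= \frac{P(O \mid P)\,P(P)}{P(O)} \le \frac{0.01}{0.1}\,P(P) = 0.1\,P(P).
\end{align*}
```

In words: under these assumptions an option-4 player is at most one tenth as likely to be offered the boxes as a randomly chosen person.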
I am too; I’m providing a hypothetical where the player’s strategy makes this the least convenient possible world for people who claim that having such an Omega is a self-consistent concept.
It may be the least convenient possible world. More specifically it is the minor inconvenience of being careful to specify the problem correctly so as not to be distracted. Nshepperd gives some of the reason typically used in such cases.
Moreover, the strategy “pick the opposite of what I predict Omega does” is a member of a class of strategies that have the same problem
What happens when you try to pick the opposite of what you predict Omega does is something like what happens when you try to beat Deep Fritz 14 at chess while outrunning a sports car. You just fail. Your brain is a few pounds of fat approximately optimised for out-competing other primates for mating opportunities. Omega is a super-intelligence. The assumption that Omega is smarter than the player isn’t an unreasonable one and is fundamental to the problem. Defying it is a particularly futile attempt to fight the hypothetical by basically ignoring it.
Generalising your proposed class to executing maximally inconvenient behaviours in response to, for example, the transparent Newcomb’s problem is where it actually gets (tangentially) interesting. In that case you can be inconvenient without out-predicting the superintelligence and so the transparent Newcomb’s problem requires more care with the if clause.
In the first scenario, I doubt you would be able to predict Omega with sufficient accuracy to be able to do what you’re suggesting. Transparent boxes, though, are interesting. The problem is, the original Newcomb’s Problem had a single situation with two possible choices involved; transparent Newcomb, however, involves two situations:
1. Transparent Box B contains $1000000.
2. Transparent Box B contains nothing.
It’s unclear from this what Omega is even trying to predict; is he predicting your response to the first situation? The second one? Both? Is he following the rule: “If the player two-boxes in either situation, fill Box B with nothing”? Is he following the rule: “If the player one-boxes in either situation, fill Box B with $1000000″? The problem isn’t well-specified; you’ll have to give a better description of the situation before a response can be given.
In the first scenario, I doubt you would be able to predict Omega with sufficient accuracy to be able to do what you’re suggesting.
That falls under 1) there are some strategies I am incapable of executing.
transparent Newcomb, however, involves two situations:
1. Transparent Box B contains $1000000.
2. Transparent Box B contains nothing.
The transparent scenario is just a restatement of the opaque scenario with transparent boxes instead of “I predict what Omega does”. If you think the transparent scenario involves two situations, then the opaque scenario involves two situations as well. (1=opaque box B contains $1000000, and I predict that Omega put in $1000000 and 2=opaque box B contains nothing, and I predict that Omega puts in nothing.) If you object that we have no reason to think both of those opaque situations are possible, I can make a similar objection to the transparent situations.
Clearly that’s two-boxing. Omega already made his choice, so if he thought I’d two-box, I’ll get:
- One box: nothing
- Two boxes: the small reward
If Omega thought I’d one-box:
- One box: big reward
- Two boxes: big reward + small reward
Two-boxing results in more money no matter how Omega thought I’d choose.
Missing the Point: now a major motion picture.
Is that the drumbeat of nshepperd’s head against the desk that I hear..? :-D
What if I try to predict what Omega does, and do the opposite?
That would mean that either 1) there are some strategies I am incapable of executing, or 2) Omega can’t in principle predict what I do, since it is indirectly predicting itself.
Alternatively, what if instead of me trying to predict Omega, we run this with transparent boxes and I base my decision on what I see in the boxes, doing the opposite of what Omega predicted? Again, Omega is indirectly predicting itself.
I don’t see how this is relevant, but yes, in principle it’s impossible to predict the universe perfectly. On account of the universe + your brain is bigger than your brain. Although, if you live in a bubble universe that is bigger than the rest of the universe, whose interaction with the rest of the universe is limited precisely to your chosen manipulation of the connecting bridge; basically, if you are AIXI, then you may be able to perfectly predict the universe conditional on your actions.
This has pretty much no impact on actual newcomb’s though, since we can just define such problems away by making omega do the obvious thing to prevent such shenanigans (“trolls get no money”). For the purpose of the thought experiment, action-conditional predictions are fine.
IOW, this is not a problem with Newcomb’s. By the way, this has been discussed previously.
You’ve now destroyed the usefulness of Newcomb as a potentially interesting analogy to the real world. In real world games, my opponent is trying to infer my strategy and I’m trying to infer theirs.
If Newcomb is only about a weird world where Omega can try to predict the player’s actions, but the player is not allowed to predict Omega’s, then it’s sort of a silly problem. It’s lost most of its generality because you’ve explicitly disallowed the majority of strategies.
If you allow the player to pursue his own strategy, then it’s still a silly problem, because the question ends up being inconsistent (because if Omega plays Omega, nothing can happen).
In real world games, we spend most of our time trying to make action-conditional predictions. “If I play Foo, then my opponent will play Bar”. There’s no attempting to circularly predict yourself with unconditional predictions. The sensible formulation of Newcomb’s matches that.
(For example, transparent boxes: Omega predicts “if I fill both boxes, then player will ___” and fills the boxes based on that prediction. Or a few other variations on that.)
In many (probably most?) games we consider the opponent’s strategy, not simply their next move. Making moves in an attempt to confuse your opponent’s estimation of your own strategy is a common tactic in many games.
Your “modified Newcomb” doesn’t allow the chooser to have a strategy; they aren’t allowed to say “if I predict Omega did X, I’ll do Y.” It’s a weird sort of game where my opponent takes my strategy into account, but something keeps me from considering my opponent’s.
Can’t Omega follow the strategy of ‘Trolls get no money,’ which by assumption is worse for you? I feel like this would result in some false positives, but perhaps not—and the scenario says nothing about the people who don’t get to play in any case.
No, because that’s fighting the hypothetical. Assume that he doesn’t do that.
It is actually approximately the opposite of fighting the hypothetical. It is managing the people who are trying to fight the hypothetical. Precise wording of the details of the specification can be used to preempt such replies, but for casual definitions that assume good faith, sometimes explicit clauses for the distracting edge cases need to be added.
It is fighting the hypothetical because you are not the only one providing hypotheticals. I am too; I’m providing a hypothetical where the player’s strategy makes this the least convenient possible world for people who claim that having such an Omega is a self-consistent concept. Saying “no, you can’t use that strategy” is fighting the hypothetical.
Moreover, the strategy “pick the opposite of what I predict Omega does” is a member of a class of strategies that have the same problem; it’s just an example of such a strategy that is particularly clear-cut, and the fact that it is clear-cut and blatantly demonstrates the problem with the scenario is the very aspect that leads you to call it trolling Omega. “You can’t troll Omega” becomes equivalent to “you can’t pick a strategy that makes the flaw in the scenario too obvious”.
If your goal is to show that Omega is “impossible” or “inconsistent”, then having Omega adopt the strategy “leave both boxes empty for people who try to predict me / do any other funny stuff” is a perfectly legitimate counterargument. It shows that Omega is in fact consistent if he adopts such a strategy. You have no right to just ignore that counterargument.
Indeed, Omega requires a strategy for when he finds that you are too hard to predict. The only reason such a strategy is not provided beforehand in the default problem description is because we are not (in the context of developing decision theory) talking about situations where you are powerful enough to predict Omega, so such a specification would be redundant. The assumption, for the purpose of illuminating problems with classical decision theory, is that Omega has vastly more computational resources than you do, so that the difficult decision tree that presents the problem will obtain.
By the way, it is extremely normal for there to be strategies you are “incapable of executing”. For example, I am currently unable to execute the strategy “predict what you will say next, and counter it first”, because I can’t predict you. Computation is a resource like any other.
If you are suggesting that Omega read my mind and think “does this human intend to outsmart me, Omega”, then sure he can do that. But that only takes care of the specific version of the strategy where the player has conscious intent.
If you’re suggesting “Omega figures out whether my strategy is functionally equivalent to trying to outsmart me”, you’re basically claiming that Omega can solve the halting problem by analyzing the situation to determine if it’s an instance of the halting problem, and outputting an appropriate answer if that is the case. That doesn’t work.
That still requires that he determine that I am too hard to predict, which either means solving the halting problem or running on a timer. Running on a timer is a legitimate answer, except again it means that there are some strategies I cannot execute.
I thought the assumption is that I am a perfect reasoner and can execute any strategy.
Why would this be the assumption?
There’s your answer.
I don’t see how omega running his simulation on a timer makes any difference for this, but either way this is normal and expected. Problem resolved.
Not at all. Though it may be convenient to postulate arbitrarily large computing power (as long as Omega’s power is increased to match) so that we can consider brute force algorithms instead of having to also worry about how to make it efficient.
(Actually, if you look at the decision tree for Newcomb’s, the intended options for your strategy are clearly supposed to be “unconditionally one-box” and “unconditionally two-box”, with potentially a mixed strategy allowed. Which is why you are provided with no information whatsoever that would allow you to predict Omega. And indeed the decision tree explicitly states that your state of knowledge is identical whether Omega fills or doesn’t fill the box.)
It’s me who has to run on a timer. If I am only permitted to execute 1000 instructions to decide what my answer is, I may not be able to simulate Omega.
Yes, I am assuming that I am capable of executing arbitrarily many instructions when computing my strategy.
I know what problem Omega is trying to solve. If I am a perfect reasoner, and I know that Omega is, I should be able to predict Omega without actually having knowledge of Omega’s internals.
Deciding which branch of the decision tree to pick is something I do using a process that has, as a step, simulating Omega. It is tempting to say “it doesn’t matter what process you use to choose a branch of the decision tree, each branch has a value that can be compared independently of why you chose the branch”, but that’s not correct. In the original problem, if I just compare the branches without considering Omega’s predictions, I should always two-box. If I consider Omega’s predictions, that cuts off some branches in a way which changes the relative ranking of the choices. If I consider my predictions of Omega’s predictions, that cuts off more branches, in a way which prevents the choices from even having a ranking.
But apparently you want to ignore the part when I said Omega has to have his own computing power increased to match. The fact that Omega is vastly more intelligent and computationally powerful than you is a fundamental premise of the problem. This is what stops you from magically “predicting him”.
Look, in Newcomb’s problem you are not supposed to be a “perfect reasoner” with infinite computing time or whatever. You are just a human. Omega is the superintelligence. So, any argument you make that is premised on being a perfect reasoner is automatically irrelevant and inapplicable. Do you have a point that is not based on this misunderstanding of the thought experiment? What is your point, even?
It’s already arbitrarily large. You want that expanded to match arbitrarily large?
Asking “which box should you pick” implies that you can follow a chain of reasoning which outputs an answer about which box to pick.
My decision making strategy is “figure out what Omega did and do the opposite”. It only fails to produce a useful result if Omega fails to produce a useful result (perhaps by trying to predict me and not halting). And Omega goes first, so we never get to the point where I try my decision strategy and don’t halt.
(And if you’re going to respond with “then Omega knows in advance that your decision strategy doesn’t halt”, how’s he going to know that?)
Furthermore, there’s always the transparent boxes situation. Instead of explicitly simulating Omega, I implicitly simulate Omega by looking in the transparent boxes and determining what Omega’s choice was.
That Omega cannot be a perfect predictor because being one no matter what strategy the human uses would imply being able to solve the halting problem.
When I say “arbitrarily large” I do not mean infinite. You have some fixed computing power, X (which you can interpret as “memory size” or “number of computations you can do before the sun explodes the next day” or whatever). The premise of Newcomb’s is that Omega has some fixed computing power Q * X, where Q is really really extremely large. You can increase X as much as you like, as long as Omega is still Q times smarter.
Which does not even remotely imply being a perfect reasoner. An ordinary human is capable of doing this just fine.
Two points: First, if Omega’s memory is Q times larger than yours, you can’t fit a simulation of him in your head. So predicting by simulation is not going to work. Second, if Omega has Q times as much computing time as you, you can try to predict him (by any method) for X steps, at which point the sun explodes. Naturally, Omega simulates you for X steps, notices that you didn’t give a result before the sun explodes, so leaves both boxes empty and flies away to safety.
Only under the artificial irrelevant-to-the-thought-experiment conditions that require him to care whether you’ll one-box or two-box after standing in front of the boxes for millions of years thinking about it. Whether or not the sun explodes, or Omega himself imposes a time limit, a realistic Omega only simulates for X steps, then stops. No halting-problem-solving involved.
In other words, if “Omega isn’t a perfect predictor” means that he can’t simulate a physical system for an infinite number of steps in finite time then I agree but don’t give a shit. Such a thing is entirely unnecessary. In the thought experiment, if you are a human, you die of aging after less than 100 years. And any strategy that involves you thinking in front of the boxes until you die of aging (or starvation, for that matter) is clearly flawed anyway.
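The “simulate for X steps, then stop” rule described above can be sketched as follows. This is only an illustration under stated assumptions: a player’s deliberation is modelled as a Python generator that yields `None` for each step of thought and a decision string when it is done, and all names (`omega_fills_box_b`, `quick_one_boxer`, `endless_ponderer`) are hypothetical.

```python
from typing import Callable, Iterator, Optional

ONE_BOX, TWO_BOX = "one-box", "two-box"

# A player is modelled (as an assumption) as a generator factory: each
# yielded None is one step of thinking, a yielded string is the decision.
Player = Callable[[], Iterator[Optional[str]]]


def quick_one_boxer() -> Iterator[Optional[str]]:
    yield None      # one step of thought
    yield ONE_BOX   # then a decision


def quick_two_boxer() -> Iterator[Optional[str]]:
    yield TWO_BOX


def endless_ponderer() -> Iterator[Optional[str]]:
    # Never reaches a decision, e.g. because it keeps trying to out-predict Omega.
    while True:
        yield None


def omega_fills_box_b(player: Player, budget: int) -> bool:
    """Run the player for at most `budget` steps; fill box B only on a one-box decision."""
    steps = player()
    for _ in range(budget):
        decision = next(steps, None)
        if decision is not None:
            return decision == ONE_BOX
    # No decision within the budget: leave box B empty and fly away.
    return False


if __name__ == "__main__":
    for player in (quick_one_boxer, quick_two_boxer, endless_ponderer):
        print(player.__name__, omega_fills_box_b(player, budget=1_000))
```

Because the loop is bounded by `budget`, the predictor always halts, whether or not the simulated player does. No halting-problem oracle is needed.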
This example is less stupid since it is not based on trying to circularly predict yourself. But in this case Omega just makes action-conditional predictions and fills the boxes however he likes.
It sounds like your decision making strategy fails to produce a useful result. That is unfortunate for anyone who happens to attempt to employ it. You might consider changing it to something that works.
“Ha! What if I don’t choose One box OR Two boxes! I can choose No Boxes out of indecision instead!” isn’t a particularly useful objection.
No, Nshepperd is right. Omega imposing computation limits on itself solves the problem (such as it is). You can waste as much time as you like. Omega is gone and so doesn’t care whether you pick any boxes before the end of time. This is a standard solution for considering cooperation between bounded rational agents with shared source code.
When attempting to achieve mutual cooperation (essentially what Newcomblike problems are all about) making yourself difficult to analyse only helps against terribly naive intelligences. ie. It’s a solved problem and essentially useless for all serious decision theory discussion about cooperation problems.
This contradicts the accuracy stated at the beginning. Omega can’t leave both boxes empty for people who try to adopt a mixed strategy AND also maintain his 99.whatever accuracy on one-boxers.
And even if Omega has way more computational power than I do, I can still generate a random number. I can flip a coin that’s 60/40 one-box, two-box. The most accurate Omega can be, then, is to assume I one-box.
He can maintain his 99% accuracy on deterministic one-boxers, which is all that matters for the hypothetical.
Alternatively, if we want to explicitly include mixed strategies as an available option, the general answer is that Omega fills the box with probability = the probability that your mixed strategy one-boxes.
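A small sketch of what that rule implies for a coin-flipping player. The assumptions are mine, not the thread’s: Omega fills box B with probability equal to the player’s one-boxing probability, independently of how the coin actually lands, and the small reward is taken to be $1,000. The function names are hypothetical.

```python
SMALL = 1_000       # assumed value of the small reward
BIG = 1_000_000     # value of box B when filled


def best_guess_accuracy(p_one_box: float) -> float:
    """Best accuracy a deterministic guess can reach against a coin with P(one-box) = p."""
    return max(p_one_box, 1 - p_one_box)


def expected_payoff(p_one_box: float) -> float:
    """Expected money when Omega fills box B with probability p_one_box."""
    p_filled = p_one_box           # the rule from the comment above
    p_two_box = 1 - p_one_box      # the small reward is collected only when two-boxing
    return p_filled * BIG + p_two_box * SMALL


if __name__ == "__main__":
    for p in (0.0, 0.6, 1.0):
        print(p, best_guess_accuracy(p), expected_payoff(p))
```

Under these assumptions the expected payoff is `p * BIG + (1 - p) * SMALL`, which increases with `p` whenever the big reward exceeds the small one. Randomising caps Omega’s accuracy at 60% for the 60/40 coin, but it does not earn the player more than plain one-boxing does.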
All of this is very true, and I agree with it wholeheartedly. However, I think Jiro’s second scenario is more interesting, because then predicting Omega is not needed; you can see what Omega’s prediction was just by looking in (the now transparent) Box B.
As I argued in this comment, however, the scenario as it currently stands is not well-specified; we need some idea of what sort of rule Omega is using to fill the boxes based on his prediction. I have not yet come up with a rule that would allow Omega to be consistent in such a scenario, though, and I’m not sure if consistency in this situation would even be possible for Omega. Any comments?
Previous discussions of Transparent Newcomb’s problem have been well specified. I seem to recall doing so in footnotes so as to avoid distraction.
The problem (such as it is) is that there is ambiguity between the possible coherent specifications, not a complete lack. As your comment points out there are (merely) two possible situations for the player to be in and Omega is able to counter-factually predict the response to either of them, with said responses limited to a boolean. That’s not a lot of permutations. You could specify all 4 exhaustively if you are lazy.
IF (Two box when empty AND One box when full) THEN X
IF …
Any difficulty here is in choosing the set of rewards that most usefully illustrate the interesting aspects of the problem.
I’d say that about hits the nail on the head. The permutations certainly are exhaustively specifiable. The problem is that I’m not sure how to specify some of the branches. Here’s all four possibilities (written in pseudo-code following your example):
1. IF (Two box when empty And Two box when full) THEN X
2. IF (One box when empty And One box when full) THEN X
3. IF (Two box when empty And One box when full) THEN X
4. IF (One box when empty And Two box when full) THEN X
The rewards for 1 and 2 seem obvious; I’m having trouble, however, imagining what the rewards for 3 and 4 should be. The original Newcomb’s Problem had a simple point to demonstrate, namely that logical connections should be respected along with causal connections. This point was made simple by the fact that there’s two choices, but only one situation. When discussing transparent Newcomb, though, it’s hard to see how this point maps to the latter two situations in a useful and/or interesting way.
Option 3 is of the most interest to me when discussing the Transparent variant. Many otherwise adamant One Boxers will advocate (what is in effect) 3 when first encountering the question. Since I advocate strategy 2 there is a more interesting theoretical disagreement. ie. From my perspective I get to argue with (literally) less-wrong wrong people, with a correspondingly higher chance that I’m the one who is confused.
The difference between 2 and 3 becomes more obviously relevant when noise is introduced (eg. 99% accuracy Omega). I choose to take literally nothing in some situations. Some think that is crazy...
In the simplest formulation the payoff for three is undetermined. But not undetermined in the sense that Omega’s proposal is made incoherent. Arbitrary as in Omega can do whatever the heck it wants and still construct a coherent narrative. I’d personally call that an obviously worse decision but for simplicity prefer to define 3 as a defect (Big Box Empty outcome).
As for 4… A payoff of both boxes empty (or both boxes full but contaminated with anthrax spores) seems fitting. But simply leaving the large box empty is sufficient for decision theoretic purposes.
Out of interest, and because your other comments on the subject seem well informed, what do you choose when you encounter Transparent Newcomb and find the big box empty?
This is a question that I find confusing due to conflicting intuitions. Fortunately, since I endorse reflective consistency, I can replace that question with the following one, which is equivalent in my decision framework, and which I find significantly less confusing:
“What would you want to precommit to doing, if you encountered transparent Newcomb and found the big box (a.k.a. Box B) empty?”
My answer to this question would be dependent upon Omega’s rule for rewarding players. If Omega only fills Box B if the player employs the strategy outlined in 2, then I would want to precommit to unconditional one-boxing—and since I would want to precommit to doing so, I would in fact do so. If Omega is willing to reward the player by filling Box B even if the player employs the strategy outlined in 3, then I would see nothing wrong with two-boxing, since I would have wanted to precommit to that strategy in advance. Personally, I find the former scenario—the one where Omega only rewards people who employ strategy 2--to be more in line with the original Newcomb’s Problem, for some intuitive reason that I can’t quite articulate.
What’s interesting, though, is that some people two-box even upon hearing that Omega only rewards the strategy outlined in 2--upon hearing, in other words, that they are in the first scenario described in the above paragraph. I would imagine that their reasoning process goes something like this: “Omega has left Box B empty. Therefore he has predicted that I’m going to two-box. It is extremely unlikely a priori that Omega is wrong in his predictions, and besides, I stand to gain nothing from one-boxing now. Therefore, I should two-box, both because it nets me more money and because Omega predicted that I would do so.”
I disagree with this line of reasoning, however, because it is very similar to the line of reasoning that leads to self-fulfilling prophecies. As a rule, I don’t do things just because somebody said I would do them, even if that somebody has a reputation for being extremely accurate, because then that becomes the only reason it happened in the first place. As with most situations involving acausal reasoning, however, I can only place so much confidence in me being correct, as opposed to me being so confused I don’t even realize I’m wrong.
It would seem to me that Omega’s actions would be as follows:
1. IF (Two box when empty And Two box when full) THEN Empty
2. IF (One box when empty And One box when full) THEN Full
3. IF (Two box when empty And One box when full) THEN Empty or Full
4. IF (One box when empty And Two box when full) THEN Refuse to present boxes
Cases 1 and 2 are straightforward. Case 3 works for the problem, no matter which set of boxes Omega chooses to leave.
In order for Omega to maintain its high prediction accuracy, though, it is necessary, if Omega predicts that a given player will choose option 4, that Omega simply refuse to present the transparent boxes to this player. Or, at least, the players who follow the other three options should vastly outnumber the fourth-option players among those who are presented with boxes.
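The four cases can be derived rather than stipulated. This is a minimal sketch under one assumption: a filling of Box B counts as “consistent” when the player’s response to seeing that filling matches the prediction that produced it (full goes with one-boxing, empty with two-boxing). The function name `consistent_fillings` is hypothetical.

```python
from itertools import product
from typing import List

ONE_BOX, TWO_BOX = "one-box", "two-box"


def consistent_fillings(when_empty: str, when_full: str) -> List[str]:
    """Fillings of Box B that the player's own conditional strategy would confirm.

    `when_empty` / `when_full` are the player's actions on seeing Box B
    empty / full. An empty box is confirmed by two-boxing, a full box by
    one-boxing.
    """
    fillings = []
    if when_empty == TWO_BOX:
        fillings.append("empty")
    if when_full == ONE_BOX:
        fillings.append("full")
    return fillings


if __name__ == "__main__":
    for when_empty, when_full in product((TWO_BOX, ONE_BOX), repeat=2):
        options = consistent_fillings(when_empty, when_full)
        verdict = " or ".join(options) if options else "no consistent filling"
        print(f"{when_empty} when empty, {when_full} when full -> {verdict}")
```

Under this reading, the player who one-boxes when the box is empty and two-boxes when it is full is the only one with no consistent filling, which is why Omega’s remaining move for that player is to not present the boxes at all.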
This is an interesting response because 4 is basically what Jiro was advocating earlier in the thread, and you’re basically suggesting that Omega wouldn’t even present the opportunity to people who would try to do that. Would you agree with this interpretation of your comment?
Yes, I would.
If we take the assumption, for the moment, that the people who would take option 4 form at least 10% of the population in general (this may be a little low), and we further take the idea that Omega has a track record of success in 99% or more of previous trials (as is often specified in Newcomb-like problems), then it is clear that whatever algorithm Omega is using to decide who to present the boxes to is biased, and biased heavily, against offering the boxes to such a person.
Consider:
P(P) = The probability that Omega will present the boxes to a given person.
P(M|P) = The probability that Omega will fill the boxes correctly (empty for a two-boxer, full for a one-boxer)
P(M’|P) = The probability that Omega will fail to fill the boxes correctly
P(O) = The probability that the person will choose option 4
P(M’|O) = 1 (from the definition of option 4) therefore P(M|O) = 0
and if Omega is a perfect predictor, then P(M|O’) = 1 as well.
P(M|P) ≥ 0.99 (from the statement of the problem)
P(O) = 0.1 (assumed)
Now, of all the people to whom boxes are presented, Omega is only getting at most one percent wrong; P(M’|P) ≤ 0.01. Since P(M’|O) = 1, and P(M’|O’) = 0, the only errors among presented players come from option-4 players, so P(M’|P) = P(O|P), and it follows that P(O|P) ≤ 0.01.
If Omega is a less than perfect predictor, then P(M’|O’) > 0, and P(O|P) < 0.01.
And, since P(O|P) ≤ 0.01 < P(O) = 0.1, option-4 players are at least ten times rarer among the people presented with boxes than in the population at large; by Bayes’ rule, P(P|O) = P(O|P) P(P) / P(O) ≤ 0.1 P(P). I therefore conclude that Omega must have a bias, and a fairly strong one, against presenting the boxes to such perverse players.
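The arithmetic behind this conclusion can be checked with a few lines. This is a sketch under one reading of the argument: the 1% error bound is treated as a bound on the share of option-4 players among those presented with boxes, P(O|P), and Bayes’ rule converts it into a bound on how much less likely an option-4 player is to be presented with boxes than an average person. The function name is hypothetical.

```python
def presentation_ratio_bound(p_option4: float, max_error_rate: float) -> float:
    """Upper bound on P(P|O) / P(P).

    Every presented option-4 player is a guaranteed miss, so
    P(O|P) <= P(M'|P) <= max_error_rate. Bayes' rule gives
    P(P|O) / P(P) = P(O|P) / P(O).
    """
    if not 0 < p_option4 <= 1:
        raise ValueError("p_option4 must be in (0, 1]")
    return max_error_rate / p_option4


if __name__ == "__main__":
    # Figures from the comment: P(O) = 0.1 (assumed there), at most 1% errors.
    print(presentation_ratio_bound(p_option4=0.1, max_error_rate=0.01))
```

With the figures above, an option-4 player is at most about a tenth as likely to be offered the boxes as a person drawn at random.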
It may be the least convenient possible world. More specifically it is the minor inconvenience of being careful to specify the problem correctly so as not to be distracted. Nshepperd gives some of the reasons typically used in such cases.
What happens when you try to pick the opposite of what you predict Omega does is something like what happens when you try to beat Deep Fritz 14 at chess while outrunning a sports car. You just fail. Your brain is a few pounds of fat approximately optimised for out-competing other primates for mating opportunities. Omega is a super-intelligence. The assumption that Omega is smarter than the player isn’t an unreasonable one and is fundamental to the problem. Defying it is a particularly futile attempt to fight the hypothetical by basically ignoring it.
Generalising your proposed class to executing maximally inconvenient behaviours in response to, for example, the transparent Newcomb’s problem is where it actually gets (tangentially) interesting. In that case you can be inconvenient without out-predicting the superintelligence and so the transparent Newcomb’s problem requires more care with the if clause.
In the first scenario, I doubt you would be able to predict Omega with sufficient accuracy to be able to do what you’re suggesting. Transparent boxes, though, are interesting. The problem is, the original Newcomb’s Problem had a single situation with two possible choices involved; transparent Newcomb, however, involves two situations:
Transparent Box B contains $1000000.
Transparent Box B contains nothing.
It’s unclear from this what Omega is even trying to predict; is he predicting your response to the first situation? The second one? Both? Is he following the rule: “If the player two-boxes in either situation, fill Box B with nothing”? Is he following the rule: “If the player one-boxes in either situation, fill Box B with $1000000”? The problem isn’t well-specified; you’ll have to give a better description of the situation before a response can be given.
That falls under 1) there are some strategies I am incapable of executing.
The transparent scenario is just a restatement of the opaque scenario with transparent boxes instead of “I predict what Omega does”. If you think the transparent scenario involves two situations, then the opaque scenario involves two situations as well. (1=opaque box B contains $1000000, and I predict that Omega put in $1000000 and 2=opaque box B contains nothing, and I predict that Omega puts in nothing.) If you object that we have no reason to think both of those opaque situations are possible, I can make a similar objection to the transparent situations.