I think the fundamental conceptual problem with Newcomb’s paradox is that it basically says, “Assume that Joe’s choice causes the box to have money in it, but it doesn’t ‘cause’ the box to have money in it.” Causation is necessary; the hypothetical just black-boxes it and says we can’t call it “causation.” This doublethink predictably leads to a great deal of confusion, which makes us dissect causality and generate analyses like these even though the problem seems to be essentially linguistic.
Edit for clarity: This is an objection to the framing of Newcomb’s itself, not to the specific treatment of causation in the article. I explain this further in a reply below, but it seems to me that Newcomb’s requires doublethink with respect to the concept of causation, and that this doublethink makes the problem useless.
I freely admit that the problem may still be above my pay grade at this point, but your comment does accurately describe my dissatisfaction with some treatments of Newcomb’s problem I’ve seen in rationalist circles. It’s like they want the decision to have everything we recognize as “causing”, but not call it that.
Perhaps it would help to repeat an analogy someone made a while back here (I think it was PhilGoetz). It’s a mapping from Newcomb’s problem to the issue of revenge:
Have the disposition to one-box --> Have the disposition to take revenge (and vice versa)
Omega predicts you’ll one-box --> People deem you the type to take revenge (perhaps at great personal cost)
You look under the sealed box --> You find out how people treat you
You actually one-box --> You actually take revenge
The mapping isn’t perfect—people don’t have Omega-like predictive powers—but it’s close enough, since people can do much better than chance.
What happens when I one-box and find nothing? Well, as is permitted in some versions, Omega made a rare mistake, and its model of me didn’t show me one-boxing.
What happens when I’m revenge-oriented, but people cheat me on deals? Well, they guessed wrong, as could Omega. But you can see how the intention has causal influence, which ends once the “others” make their irreversible choice. Taking revenge doesn’t undo those acts, but it may prevent future ones.
Apologies if I’ve missed a discussion which has beaten this issue to death, which I probably have. Indeed, that was the complaint when (I think) PhilGoetz brought it up.
Update: PhilGoetz was the one who gave me the idea, in what was a quite reviled top-level post. But interestingly enough, in that thread, Eliezer_Yudkowsky said that his decision theory would have him take revenge, and for the same reason that he would one-box!
And here’s my remark showing my appreciation for PhilGoetz’s insight at the time. An under-rated post on his part, I think...
Hmm. I’m not trying to “black box” causation but to understand the code that leads us and others to impute causation, in the special case where we ourselves are “causing” things by our “choices”. I mean, what computations are we humans actually doing, when we figure out what parts of the world we can “cause” to be different? What similar agents would we find it useful to design, and what would they think they could cause, by their choices?
I’ll spell this out more in my next post, but I’m calling a “Could/Would/Should Agent”, or CSA, anything that:
Regards itself as having a certain set of labeled “choices”, which it “could” take;
Has a model of what the world “would” be like, and what its expected payoff would be, if it took each given choice;
Decides that it “should” take whichever choice has highest expected-payoff (according to its model), and in fact takes that choice.
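To make the definition concrete, here is a minimal sketch of such an agent in Python. It is only an illustration of the could/would/should structure above, not anyone’s official formalism; the names csa_decide, choices, and would_model are my own, and the “would” model is passed in as a black box that a real CSA would have to construct.

```python
# Toy Could/Would/Should Agent: a sketch of the structure described above.
# The "would" model is supplied as a function; how that function should
# compute counterfactual payoffs is exactly the question under discussion.

def csa_decide(choices, would_model):
    """Return the choice the agent 'should' take: the labeled option
    from its 'could' set whose modeled outcome has the highest
    expected payoff according to its 'would' model."""
    best_choice, best_payoff = None, float("-inf")
    for choice in choices:                 # the "could" set
        payoff = would_model(choice)       # the "would" model
        if payoff > best_payoff:
            best_choice, best_payoff = choice, payoff
    return best_choice                     # the "should" output

# Example: a CSA whose counterfactual model says one-boxing yields
# $1,000,000 and two-boxing yields $1,000. A differently constructed
# "would" model could just as easily report $1,001,000 for two-boxing,
# which is why the counterfactual step matters.
print(csa_decide(["one-box", "two-box"],
                 lambda c: 1_000_000 if c == "one-box" else 1_000))
```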
There is more than one way to set up such a CSA. In particular, there is more than one way we could set up a CSA to model what “would” (counterfactually) happen “if it one-boxed” vs. “if it two-boxed”. That is why I am discussing counterfactuals.
I don’t fully follow where you’re coming from. Are you saying cause is an essential building-block that I should take as basic, and use, rather than trying to take cause apart? What do you mean when you say the problem “seems to be essentially linguistic”?
It’s an objection to Newcomb’s specifically, not cause or decision theory generally. My position may be a bit too complex for a comment, but here’s the gist.
Newcomb’s assumes that deciding A will result in universe X, and deciding B will result in universe Y. It uses the black box of Omega’s prediction process to forbid us from calling the connection causal, thus preventing CDT from working, but it requires that our decision be causal, because if it weren’t there would be no reason not to two-box. It assumes causation but prohibits us from calling it causation. If we actually understood how our choice would result in the opaque box being full or empty, the problem would be entirely trivial. Newcomb’s thus disproves CDT by assuming causation-that-is-not-causation, and such an assumption does not seem to actually prove anything about the world.
The smoking lesion problem has the same flaw in reverse. It requires EDT to treat Susan’s choice as evidence about whether she gets cancer, while also stipulating that her choice has no effect on whether she gets cancer. This linguistic doublethink is all that makes the problem difficult.
In Newcomb’s, a full understanding of how Omega’s prediction works would make the problem trivial, because it could be incorporated into CDT. If we don’t assume that the prediction works, the problem doesn’t work; there’s no reason not to use CDT if Omega can’t predict systematically. In the Smoking Lesion, a proper understanding of the common cause that actually does produce the cancer would make the problem tractable for EDT, since it would be obvious that her chance of getting cancer is independent of her choice to smoke. If we don’t assume that such a common cause exists, the problem doesn’t work; EDT says Susan shouldn’t smoke, which basically makes sense if the correlation has a meaningful chance of being causal. This is what I mean by calling it a linguistic problem: language lets us state these examples with no apparent contradiction, but the contradiction is there if we break them down far enough.
What if we ran a contest of decision theories on Newcomb’s problem in a similar fashion to Axelrod’s test of iterated PD programs? I (as Omega) would ask you to submit an explicit deterministic program X that’s going to face a gauntlet of simple decision theory problems (including some Newcomb-like problems), and the payoffs it earns will be yours at the end.
In this case, I don’t think you’d care (for programming purposes) whether I analyze X mathematically to figure out whether it 1- or 2-boxes, or whether I run simulations of X to see what it does, or anything else, so long as you have confidence that I will accurately predict X’s choices (and play honestly as Omega). And I’m confident that if the payoffs are large enough to matter to you, you will not submit a CDT program or any 2-boxing program.
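To illustrate what such a gauntlet might look like, here is a toy harness (the name run_newcomb_round and the sample programs are my own inventions; Omega here predicts simply by simulating the submitted program, which is one of the admissible methods mentioned above):

```python
# Toy Newcomb round: Omega predicts a submitted program's choice by
# simulating it, fills the boxes accordingly, then the same program
# actually plays. This is a sketch of the proposed contest, not a spec.

def run_newcomb_round(program):
    predicted = program()                 # Omega's prediction via simulation
    opaque = 1_000_000 if predicted == "one-box" else 0
    transparent = 1_000

    actual = program()                    # the program's actual choice
    return opaque if actual == "one-box" else opaque + transparent

def cdt_style_program():
    # Considers only what its choice can causally affect once the boxes
    # are filled, so it always takes both boxes.
    return "two-box"

def one_boxing_program():
    return "one-box"

print(run_newcomb_round(cdt_style_program))    # 1000
print(run_newcomb_round(one_boxing_program))   # 1000000
```

Under this harness a two-boxing submission earns $1,000 per round and a one-boxing submission earns $1,000,000, which is the sense in which you would not submit a CDT-style program.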
So it seems to me that the ‘linguistic confusion’ you face might have more to do with the way your current (actual, horrendously complicated) decision process feels from inside than with an inherent contradiction in Newcomb’s Problem.
going to face a gauntlet of simple decision theory problems (including some Newcomb-like problems)
This is the issue. I suspect that Newcomb-like problems aren’t meaningfully possible. Once you “explain” the problem to a machine, its choice actually causes the box to be full or empty. Omega’s prediction functions as causation-without-being-causal, which makes some sense to our minds, but does not seem like something a machine would understand. In other words, the reason CDT does not work for a machine is that the inputs are wrong, not the algorithm. A machine that interpreted information correctly would understand its actions as causal even if it didn’t know how they did so, because it’s a key assumption of the problem that they are functionally causal. If the program does not have that key assumption available to it, it should rationally two-box, so it’s totally unsurprising that prohibiting it from “understanding” the causal power of its decision results in it making the wrong decision.
Your counterexample is also problematic because I understand your prediction mechanism; I know how you will analyze my program, though there’s some small chance you’ll read the code wrong and come to the wrong conclusion, much like there’s some chance Omega gets it wrong. Thus, there’s a directly apparent causal connection between the program’s decision to one-box and you putting the money in that box. CDT thus appears to work, since “program one-boxes” directly causes one-boxing to be the correct strategy. In order to make CDT not work, you’d need to arbitrarily prevent the program from incorporating this fact. And, if I were really, really smart (and if I cared enough), I’d design a program that you would predict would one-box, but that actually two-boxed when you put it to the test. That is the winningest strategy possible (if it is actually possible); the only reason we never consider it with Omega is that it’s assumed it wouldn’t work.
At this moment, I agree with Psychohistorian that the apparent conundrum is a result of forcing a distinction about causality when there really isn’t one.
On the one hand, we say that the contents of the boxes are not directly, causally related to our choice to one box or two box. (We assert this, I suppose, because of the separation in time between the events, where the boxes are filled before we make our choice.)
On the other hand, we say that Omega can predict with great accuracy what we choose. This implies two things: our decision algorithm for making the choice is pre-written and deterministic, and Omega has access to our decision making algorithm.
Omega bases the contents of the box on the output of our decision making algorithm (that he simulates at time (t-y)) so the contents of the box are directly, causally related to the output of our decision algorithm.
It seems wrong to say that the contents of the box are not causally related to the output of our decision algorithm at time t (i.e., our choice), but are causally related to the output of the same algorithm at time (t-y), even though the decision algorithm is deterministic and hasn’t changed.
In a deterministic system in which information isn’t lost as time progresses, the time separation between events (positive or negative) makes no difference to the causality: “a causes b” if b depends on a (even if b happens before a). For example, afternoon rain will cause me to bring my umbrella in the morning, in an information-complete system.
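Under the stated assumptions (a fixed deterministic decision algorithm, with Omega able to run it earlier than we do), the point can be sketched in a few lines of Python; decision_algorithm and the state dictionary below are purely illustrative stand-ins:

```python
# The same deterministic function is evaluated twice: by Omega at time
# t - y (to fill the box) and by the agent at time t (the "choice").
# Nothing flows backwards in time; the box contents and the choice
# simply share a common cause, namely the algorithm and its inputs.

def decision_algorithm(state):
    # Stand-in for whatever deterministic computation the agent runs.
    return "one-box" if state["convinced_to_one_box"] else "two-box"

agent_state = {"convinced_to_one_box": True}

prediction_at_t_minus_y = decision_algorithm(agent_state)   # Omega's simulation
box_contents = 1_000_000 if prediction_at_t_minus_y == "one-box" else 0

choice_at_t = decision_algorithm(agent_state)               # the agent's "choice"

# Since neither the function nor its inputs changed between t - y and t,
# the two outputs always agree, so the perfect correlation requires no
# backwards causation.
assert prediction_at_t_minus_y == choice_at_t
print(box_contents)   # 1000000 here; 0 if agent_state had been otherwise
```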
Later edit: This represents the location in {comment space}-time where (I think) I’ve understood the solution to Newcomb’s problem, in the context of the substantial clues found here on LW. I had another comment in this thread explaining my solution that I’ve deleted. I don’t want to distract from Anna’s sequence (and I predict the usual philosophical differences) but I’ve kept my deleted comment in case there are more substantial differences.
I would say that the ambiguity/doublethink about causality is actually the feature of Newcomb’s problem that helps us reduce what causality is.
Of all the comments in this block, byrnema’s seems the most on-track, having the most ingredients of the solution, in my view. A few points:
I prefer to suppose that Omega has a powerful, detailed model of the local world, or whatever parts of the universe are ultimately factors in Joe’s decision. It isn’t just the contents of Joe’s brain. Omega’s track record is strong evidence that his model takes enough into account.
I do not see any backwards-in-time causality in this problem at all. That Joe’s state causes both Omega’s prediction and Joe’s choice is not the same as the choice causing the prediction.
In fact, that’s what seems wrong to me about most of the other comments right here. People keep talking about the choice causing something, but the problem says nothing about this at all. Joe’s choice doesn’t need to cause anything. Instead, Joe’s choice and Omega’s (prediction->money-hiding) have common causes.
The way I see it, the sleight-of-hand in this problem occurs when we ask what Joe “should” do. I think focusing on Joe’s choice leads people to imagine that the choice is free in the sense of being unconnected to Omega’s prediction (since the prediction has already happened). But it is not unconnected, because our choices are not un-caused. Neither are they connected backwards-in-time. Omega’s actions and Joe’s choice are connected because they share common causes.
EDIT: To make this a bit more concrete: Make this a question of what you “should” do if you meet Omega someday. Consider that your decision might be highly influenced by all the musings on the blog, or on Eliezer’s or another poster’s arguments. If these arguments convince you that you should one-box, then they also cause Omega to predict that you’ll one-box. If these arguments fail to convince you, then that circumstance also causes Omega to predict you will two-box.
You’ve got to resist thinking of the machinery of human decision-making as primary or transcendent. See Thou Art Physics.
This represents the location in {comment space}-time where (I think) I’ve understood the solution to Newcomb’s problem, in the context of the substantial clues found here on LW. I had another comment in this thread explaining my solution that I’ve deleted. I don’t want to distract from Anna’s sequence
I’d say go ahead and distract. I’d love to see your solution.
How about if I send you my solution as a message? You can let me know if I’m on the right track or not...
So should you one box or two box?
It seems problematic that the boxes are already filled and you are now deciding whether to one box or two box; that your decision can cause the contents of the box to be one way or the other. But it’s not really like that, because your decision has already been determined, even though you don’t know what the decision will be and are even still writing your decision algorithm.
So you are currently in the process of writing your decision algorithm, based on this situation you find yourself in where you need to make a decision. It feels free to you, but of course it isn’t because your “choices” while making the decision algorithm are controlled by other-level functions which are themselves deterministic. These other-level functions “should” compute that their decisions will affect / have affected the outcome and decide to one-box. By “should” I mean that this is best for the decider, but he has no control over how the decision algorithm actually gets written or what the outcome is.
The person that one-boxes will get more money than the person that two-boxes. Since the person has no real choice in the matter, regarding which sort of algorithm they are, you can only say that one is better than the other for getting the most money in this situation, as “should” would imply some kind of free will.
Suppose there is a mutation that makes the plants carrying it hardier than those without it. We would say that one set of plants is better than the other at being hardy. We wouldn’t say that a plant “should” be the hardy variety; they have no choice in the matter.
We are just like plants. Some of us have a better set of states than others for success in whatever. One of these states is an algorithm that computes what to do (makes “choices”) based on predicted consequences of anticipated actions. When we run this algorithm and compute the “chosen action”, it feels on the inside like we are making a choice. It feels like a choice because we input multiple possible actions, and then the algorithm outputs a single action. But really we just happen to have an algorithm that outputs that choice. The algorithm can certainly be modified over time, but the self-modification is deterministic and depends on the environment.
Less Wrong is an environment where we are making the algorithm reflect upon itself. Everyone should consider the possibilities (have an algorithm that one boxes or two boxes?), realize that the one-box algorithm is the one that makes them rich, and update their algorithm accordingly.
Or more precisely: Everyone will consider the possibilities (have an algorithm that one boxes or two boxes?), may realize that the one-box algorithm is the one that makes them rich, depending on their states, and may update their algorithm accordingly, again depending on their states. People generally have good algorithms, so I predict they will do this unless they have a reason not to. What is that reason?
To me so far, it does sound like a linguistic problem with the word “should”.
Does “should” mean what Joe should do to optimize his situation regardless of what is possible? (Then Joe should plan to one-box, but ultimately two-box via magical surgery where someone swoops in and changes his brain.)
Does “should” mean what Joe should do to optimize his situation restricted to what is possible?
Using the first meaning of should, Joe should two box. (He should if he could trick Omega; too bad he can’t.) Using the second meaning, Joe should one box.
The second meaning is more natural to me. (Choices must be made based on real world conditions.) But I can see how the first meaning would be more natural to others. (What is really optimal?)
This seems correct to me, and I don’t think it has any significant disagreement with AnnaSalamon’s analysis above.
Personally, Newcomb’s seems to me as though it’s designed to generate this sort of confusion, in order to seem interesting or ‘deep’ - much like mysticism.
I suggest that what is happening here is that Omega establishes a causal link with the agent’s decision to 1- or 2-box. Consider an analogy: You happen upon a group of crystals, each of which is anchored to the ground and grows upward with a complicated internal structure. Each is generally cylindrical to a minimum length, whereafter it continues in a helical fashion. Some crystals have a right-handed helix and others are left-handed. You, for reasons of your own, determine which are left-handed and at a point just below the start of the helix mark them, perhaps with a drop of ink.
Omega has done nothing more than this. His mark is the contents of the opaque box.
What the agent “should” do is to 1-box… that is, to turn left at the start of its helix… because that is the “moment” at which causality kicks in. No doublethink required.