I’m confused about Newcomb’s problem. Here is how I’m thinking about it.
Suppose that there are three points in time:
T1: Before you are confronted by Omega.
T2: You are confronted by Omega.
T3: After you are confronted by Omega.
If backwards causality is real and at T2 you can affect T1, then you should one-box in order to get the million dollars instead of the thousand dollars.
If at T2 you think there is a sufficient probability of being confronted by a similar Newcomb-like problem at T3 (some time in the future) where one-boxing at T2 will lead to better outcomes at T3, then you should one-box.
But if you somehow knew that a) backwards causality is impossible and b) you’d never be confronted by a Newcomb-like problem at T3, then I suppose you should two-box. Maybe this is tautological, but given (a) and (b) I don’t see how one-boxing would ever help you “win”.
Suppose that instead of you confronting Omega, I am confronting Omega, while you are watching the events from above, and you can choose to magically change my action to something different. That is, you press a button, and if you do, the neuron values in my brain get overridden such that I change my behavior from one-boxing to two-boxing. Nothing else changes: Omega has already decided the contents of the boxes, so I walk away with more money.
This different problem is actually not different at all; it’s isomorphic to how you’ve just framed the problem. You’ve assumed that you can perform surgery on your own action without affecting anything else in the environment. And it’s not just you; this is how causal decision theory is formalized. All causal decision theory problems are isomorphic to the different problem where, instead of doing the experiment, you are an omnipotent observer of the experiment who can act on the agent’s brain from the outside.
In this transformed version of a decision problem, CDT is indeed optimal and you should two-box. But in the real world, you can’t do that. You’re not outside the problem looking in, capable of performing surgery on your brain to change your decision in such a way that nothing else is affected. You are a deterministic agent running a fixed program, and either your program performs well or it doesn’t. So the problem with “isn’t two-boxing better here?” isn’t that it’s false, it’s that it’s uninteresting because that’s not something you can do. You can’t choose your decision, you can only choose your program.
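To make the contrast concrete, here is a minimal Python sketch. It assumes a perfect predictor that simply runs the agent's program. The names `outcome` and `override` are hypothetical and introduced only for illustration; `override` stands in for the outside observer's button.

```python
from typing import Callable, Optional

Program = Callable[[], str]  # a program returns "one-box" or "two-box"

def one_boxer() -> str:
    return "one-box"

def two_boxer() -> str:
    return "two-box"

def outcome(program: Program, override: Optional[str] = None) -> int:
    """Payoff in dollars. `override` models the outside observer's button:
    it replaces the action AFTER Omega has already filled the boxes."""
    # Omega predicts by running the program (assumption: perfect prediction).
    box_b = 1_000_000 if program() == "one-box" else 0
    # Without the button, the action is just whatever the program outputs.
    action = override if override is not None else program()
    return box_b + (1_000 if action == "two-box" else 0)

print(outcome(one_boxer))                      # agent inside the problem
print(outcome(two_boxer))                      # agent inside the problem
print(outcome(one_boxer, override="two-box"))  # observer performing surgery
```

Only the last call gets both boxes full, and it needs the `override` argument, which the agent inside the experiment doesn't have. From the inside, the only thing that varies is which program is passed in.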
Hm, interesting. I think I see what you mean about the decision vs the program. I also think that is addressed by my points about T3 though.
By one-boxing at T2 you choose one-boxing (or a decision theory that leads to one-boxing) as the program you are running, and so if you are confronted by Omega (or something similar) at T3 they’ll put a million dollars in box B (or something similar). But if you happened to know that there would be no situations like this at T3, then installing the one-boxing program at T2 wouldn’t actually help you win.
Maybe I am missing something though?
When exactly is t2? Is it before or after Omega has decided on the contents of the box?
If it’s before, then one-boxing is better. If it’s after, then again you can’t change anything here. You’ll do whatever Omega predicted you’ll do.
After.
I don’t see how that follows. As an analogy, suppose you knew that Omega predicted whether you’d eat salad or pizza for lunch but didn’t know Omega’s prediction. When lunch time rolls around, it’s still useful to think about whether you should eat salad or pizza, right?
Let’s formalize it. Your decision procedure is some kind of algorithm p that can do arbitrary computational steps but is deterministic. To model “arbitrary computational steps”, let’s just think of it as a function outputting an arbitrary string representing thoughts or computations or whatnot. The only input to p is the time step since you don’t receive any other information. So in the toy model, p:{t1,t2,t3}→Σ∗. Also, your output p(t2) must be such that it determines your decision, so we can define a predicate D that takes your thoughts p(t2) and checks whether you one-box or two-box.
Then the procedure of Omega that fills the opaque box, Ω, is just a function defined by the rule
$$
\Omega : p \mapsto
\begin{cases}
1000000 & \text{if } D(p(t_2)) = \text{one-box} \\
0 & \text{if } D(p(t_2)) = \text{two-box}
\end{cases}
$$
So what Causal Decision Theory allows you to do (and what I feel like you’re still trying to do) is choose the output of p at time t2. But you can’t do this. What you can do is choose p, arbitrarily. You can choose it to always one-box, always two-box, to think “I will one-box” at time t1 and then two-box at time t2, etc. But you don’t get around the fact that every p such that D(p(t2))=one-box ends up with a million dollars and every p such that D(p(t2))=two-box ends up with 1000 dollars. Hence you should choose one of the former kinds of p.
(And yes, I realize that the formalism of letting you output an arbitrary string of thoughts is completely redundant since it just gets reduced to a binary choice anyway, but that’s kinda the point since the same is true for you in the experiment. It doesn’t matter whether you first decide to one-box before you eventually two-box; any choice that ends with you two-boxing is equivalent. The real culprit here is that the intuition of you choosing your action is so hard to get rid of.)
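The toy model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: thoughts are plain strings, the predicate `D` reads the decision off the end of the string, and the helpers `payoff`, `always_one_box`, `always_two_box`, and `waverer` are hypothetical names not taken from the discussion above.

```python
from typing import Callable

# p maps a time step ("t1", "t2", "t3") to a string of thoughts.
Program = Callable[[str], str]

def D(thoughts: str) -> str:
    """Decision predicate. Assumption: the thoughts end with the action taken."""
    return "two-box" if thoughts.endswith("two-box") else "one-box"

def omega(p: Program) -> int:
    """Omega fills the opaque box based on what p does at t2."""
    return 1_000_000 if D(p("t2")) == "one-box" else 0

def payoff(p: Program) -> int:
    """Total winnings: the opaque box, plus 1000 if the agent also takes the other box."""
    box_b = omega(p)
    return box_b if D(p("t2")) == "one-box" else box_b + 1_000

def always_one_box(t: str) -> str:
    return "I will one-box"

def always_two_box(t: str) -> str:
    return "I will two-box"

def waverer(t: str) -> str:
    # Thinks "I will one-box" at t1, then two-boxes from t2 onward.
    return "I will one-box" if t == "t1" else "on reflection, two-box"

for p in (always_one_box, always_two_box, waverer):
    print(p.__name__, payoff(p))
```

Note that `payoff` takes the whole program `p` as its argument. There is no parameter for setting the output at t2 separately from `p`, which is the point of the formalism. The `waverer` ends up with the same result as `always_two_box`, since only `D(p("t2"))` matters.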
I’m not familiar with some of this notation but I’ll do my best.
It makes sense to me that if you can install a decision algorithm into yourself, at T0 let’s say, then you’d want to install one that one-boxes.
But I don’t think that’s the scenario in Newcomb’s Problem. From what I understand, in Newcomb’s Problem, you’re sitting there at T2, confronted by Omega, never having thought about any of this stuff before (let’s suppose). At that point you can come up with a decision algorithm. But T1 is already in the past, so whatever algorithm you come up with at T2 won’t actually affect what Omega predicts you’ll do (assuming no backwards causality).
With the sentence “At that point you can come up with a decision algorithm”, you’re again putting yourself outside the experiment; you get a model where you-the-person-in-the-experiment is one thing inside the experiment, and you-the-agent is another thing sitting outside, choosing what your brain does.
But it doesn’t work that way. In the formalism, p describes your entire brain. (Which is the correct way to formalize it because Omega can look at your entire brain.) Your brain cannot step out of causality and decide to install a different algorithm. Your brain is entirely described by p, and it’s doing exactly what p does, which is also what Omega predicted.
If it helps, you can forget about the “decision algorithm” abstraction altogether. Your brain is a deterministic system; Omega predicted what it does at t2, and it will do exactly that thing. You cannot decide to do something other than the deterministic output of your brain.
I found Joe Carlsmith’s post to be super helpful on this. Especially his discussion of the perfect deterministic identical twin prisoner’s dilemma.
Omega is a nigh-perfect predictor: “Omega has put a million dollars in box B iff Omega has predicted that you will take only box B.”
So if you follow the kind of decision algorithm that would make you two-box, box B will be empty.
How do concepts like backwards causality make any difference here?
At the point of decision, T2, you want box B to have the million dollars. But Omega’s decision was made at T1. If you want to affect T1 from T2, it seems to me like you’d need backwards causality.
Omega’s decision at T2 (I don’t understand why you try to distinguish between T1 and T2; T1 seems irrelevant) is based on its prediction of your decision algorithm in Newcomb problems (including on what it predicts you’ll do at T3). It presents you with two boxes. And if it expects you to two-box at T3, then its box B is empty. What is timing supposed to change about this?