Re: “Some concluding chiding of those philosophers who blithely decided that the “rational” course of action systematically loses”
Some of those philosophers draw a distinction between rational action and the actions of a rational agent—see here:

I conclude that the rational action for a player in the Newcomb Paradox is taking both boxes, but that rational agents will usually take only one box because they have rationally adopted the disposition to do so.
So: these folk had got the right answer, and any debate with them is over terminology.
(Looks over Tim Tyler’s general trend in comments.)
Okay. It’s helpful that you’re doing a literature search. It’s not helpful that every time you find something remotely related, you feel a need to claim that it is already TDT and that TDT is nothing innovative by comparison. It does not appear to me that you understand either the general background of these questions as they have been pursued within decision theory, or TDT in particular. Literature search is great, but if you’re just spending 15 minutes Googling, then you have insufficient knowledge to compare the theories. Plenty of people have called for a decision theory that one-boxes on Newcomb and smokes on the smoking lesion—the question is coughing up something that seems reasonably formal. Plenty of people have advocated precommitment, but it comes with its own set of problems, and that is why a non-precommitment-based decision theory is important.
In the spirit of dredging up references with no actual deep insight, I note this recent post on Andrew Gelman’s blog.
Well, other people have previously taken a crack at the same problem.
If they have resolved it, then I should think that would be helpful—since then you can look at their solution. If not, their efforts to solve the problem might still be enlightening.
So: I think my contribution in this area is probably helpful.
15 minutes was how long it took me to find the cited material in the first place. Not trivial—but not that hard.
No need to beat me up for not knowing the background of your own largely unpublished theory!
...but yes, in my view, advanced decision theory is a bit of a red herring for those interested in machine intelligence. It’s like: that is so not the problem. It seems like wondering whether to use butter-icing or marzipan on the top of the cake—when you don’t yet have the recipe or the ingredients.
The cited material isn’t much different from a lot of other material in the same field.
So far, “Disposition-Based Decision Theory” (and its apparently-flawed precursor) is the only thing I have seen that claims to address and solve the same problem that is under discussion in this forum:
I suppose there’s also a raft of CDT enthusiasts, who explain why two-boxing is actually not a flaw in their system, and that they have no objections to the idea of agents who one-box. In their case, the debate appears to be over terminology: what does the word “rational” actually mean—is it about choosing the best action from the available options? Or does it mean something else?
Are there other attempts at a solution? Your turn for some references, I feel.
“Paradoxes of Rationality and Cooperation” (the edited volume) will give you a feel for the basics, as will reading Marion Ledwig’s thesis paper.
Marion Ledwig’s thesis appears to be an overview of Newcomb’s Problem from 2000. That’s from before the disposition-based decision theory I referenced was worked out—and there’s minimal coverage.
Are you suggesting that there are some proposed solutions to the problem of building a decision theory that “does the right thing” somewhere in there, that pre-date disposition-based decision theory?
The main thesis there (in the section “Newcomb’s Problem as a Game against Nature”) seems to go against what many people think here—and is more along the lines of CDT.
Anyway, since it’s a 300 page thesis, perhaps you would like to be more specific.
Or maybe you are just waving in the general direction of the existing literature. In which case, I fail to see how that addresses my point.
“Paradoxes of Rationality and Cooperation” dates from 1985. That seems rather unlikely to have relevant coverage either. Again, it came too early—before the first attempts at a solution that I’m aware of.
People have been trying to solve the problem since the day it was presented, and it’s pretty clear that you don’t understand which parts of this particular solution are supposed to be novel. The main novel idea is the incorporation of logical uncertainty into Pearl-style causal graphs and the formulation of the counterfactuals as surgery over those causal graphs.
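For concreteness, here is a minimal toy sketch (Python; the node names and payoffs are invented for this illustration, it is not TDT’s actual formalism, and it leaves logical uncertainty out entirely) of how the choice of where the surgery is performed changes the verdict:

```python
# Toy Newcomb causal graph: the output of the agent's decision algorithm
# ("algo") sits upstream of both Omega's prediction and the agent's action.
# Counterfactuals are computed by "surgery": cut a node away from its
# parents, set it by hand, and read off the payoff.

PAYOFF = {  # (action, prediction) -> dollars
    ("one-box", "one-box"): 1_000_000,
    ("one-box", "two-box"): 0,
    ("two-box", "one-box"): 1_001_000,
    ("two-box", "two-box"): 1_000,
}

def cdt_counterfactual(action, fixed_prediction):
    # CDT-style surgery: sever only the action from its causes.  Omega's
    # prediction is not downstream of the action, so it stays as it was.
    return PAYOFF[(action, fixed_prediction)]

def algo_counterfactual(algo_output):
    # Surgery on the algorithm node itself: the prediction and the action
    # are both downstream of it, so both move together.
    return PAYOFF[(algo_output, algo_output)]

# Under CDT-style surgery, two-boxing dominates whatever the fixed prediction is.
for prediction in ("one-box", "two-box"):
    assert cdt_counterfactual("two-box", prediction) > cdt_counterfactual("one-box", prediction)

# Under surgery at the algorithm node, one-boxing comes out ahead.
assert algo_counterfactual("one-box") > algo_counterfactual("two-box")
```

The part the sketch leaves out is precisely the hard part: treating that "algo" node as a logical fact the agent is uncertain about, rather than as a physical switch.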
The idea that rationalists should make lots of money, versus the idea that rationalists should appear very reasonable, has been a central point of controversy from the beginning.
Talking about dispositions and precommitments has been going on since the beginning.
If you’re going to start waving judgments of novelty around, then read the literature.
One problem here is that the “particular solution” whose novelty I am apparently expected to understand hasn’t actually been published. Instead, what we have is some notes.
The problem I was considering involves finding a decision theory that obtains the “right” answer to problems like Newcomb’s Problem and the Smoking Lesion problem. If you are trying to solve some other problem, that’s fine.
If “precommitment” just means cutting off some of your options in advance, precommitment seems desirable under various circumstances, namely where you want to signal commitment and believe that faked commitment signals would be detected. You use the term as though it is in some way negative.
It seems to me that I have not encountered the critics of precommitment saying what they mean by the term. Consequently, it is hard to see what problems they see with the idea.
This is the crippleware version of TDT that pure CDT agents self-modify to. It’s crippleware because if you self-modify at 7:00pm you’ll two-box against an Omega who saw your code at 6:59am.
By hypothesis, Omega on examining your code at 6:59, knows that you will self-modify at 7:00 and one-box thereafter.
Consider that every TDT agent must be derived from a non-TDT agent. There is no difference in principle between “I used to adhere to CDT but self-modified to TDT” and “I didn’t understand TDT when I was a child, but I follow it now as an adult”.
Correction made, thanks to Tim Tyler.
CDT agents don’t care. They aren’t causing Omega to fill box B by changing their source code at 7pm, so they have no reason to change their source code in a way that takes only one box. The source code change only causes Omega to fill box B if Omega looks at their source code after 7pm. That is how CDT agents (unwisely) compute “causes”.
Yes, but the CDT agent at seven o’clock is not being asked to choose one or two boxes. It has to choose between rewriting its algorithm to plain TDT (or DBDT or some variant that will one-box), or to TDT with an exception clause “but use the old algorithm if you find out Omega’s prediction was made before seven o’clock”. Even by straight CDT, there is no motive for writing that exception.
This is the point at which I say “Wrong” and “Read the literature”. I’m not sure how I can explain this any more clearly than I have already, barring a full-fledged sequence. At 7pm the CDT agent calculates that if it modifies its source to use the old algorithm in cases where Omega saw the code before 7pm, it will get an extra thousand dollars on Newcomb’s Problem, since it will take box A which contains an additional thousand dollars, and since its decision to modify its code at 7pm has no effect on an Omega who saw the code before 7pm, hence no effect on whether box B is full. It does not reason “but Omega knows I will change my code”. If it reasoned that way it would be TDT, not CDT, and would one-box to begin with.
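For readers following the back-and-forth, here is a minimal sketch of that 7pm calculation as described above (Python; the numbers and names are invented for the illustration, and this is not anyone’s published code):

```python
# How a CDT agent scores its 7pm self-modification options.  The code change
# cannot causally affect a prediction Omega already made at 6:59, so box B's
# contents are held fixed in the calculation, whatever they happen to be.

BOX_A = 1_000          # always on the table
BOX_B = 1_000_000      # full only if Omega predicted one-boxing

def cdt_score(policy, box_b_full, prediction_made_before_7pm=True):
    take_both = (policy == "one-box, except when the prediction predates 7pm"
                 and prediction_made_before_7pm)
    winnings = BOX_A if take_both else 0
    winnings += BOX_B if box_b_full else 0
    return winnings

for box_b_full in (True, False):   # fixed background fact, per CDT's causal model
    gap = (cdt_score("one-box, except when the prediction predates 7pm", box_b_full)
           - cdt_score("one-box everywhere", box_b_full))
    print(gap)   # +1000 either way: the exception clause looks strictly better to CDT
```

An agent that instead conditioned box B’s contents on its own (predictable) code change would already be departing from CDT, which is the point of the paragraph above.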
Actually I will add another comment, because I can now articulate where the ambiguity comes in: how you add self-modification to CDT (which doesn’t have it in the usual form). I’ve been assuming the original algorithm doesn’t try to micromanage the new algorithm’s decisions (which strikes me as the sensible way, not least because it gives better results here); you’ve been assuming it does (which, I suppose you could argue, is more true to the spirit of the original CDT).
I still disagree, but I agree that we have hit the limits of discussion in this comment thread; fundamentally this needs to be analyzed in a more precise language than English. We can revisit it if either of us ever gets to actually programming anything like this.
By hypothesis, Omega on examining your code at 6:59, knows that you will self-modify at 7:00 and two-box thereafter.

By what hypothesis? That is not how the proposed Disposition-Based Decision Theory says it works. It claims to result in agents who have the disposition to one-box.
Sure. This subthread was about plain CDT, and how it self-modifies into some form of DBDT/TDT once it figures out the benefits of doing so—and given the hypothesis of an omniscient Omega, then Omega will know that this will occur.
In that case, what I think you meant to say was:

By hypothesis, Omega on examining your code at 6:59, knows that you will self-modify at 7:00 and one-box thereafter.
Doh! Thanks for the correction, editing comment.
I don’t see any reason for thinking this fellow’s work represents “crippleware”.
It seems to me that he agrees with you regarding actions, but differs about terminology.
Here’s the CDT explanation of the terminology:

A way of reconciling the two sides of the debate about Newcomb’s problem acknowledges that a rational person should prepare for the problem by cultivating a disposition to one-box. Then whenever the problem arises, the disposition will prompt a prediction of one-boxing and afterwards the act of one-boxing (still freely chosen). Causal decision theory may acknowledge the value of this preparation. It may conclude that cultivating a disposition to one-box is rational although one-boxing itself is irrational. Hence, if in Newcomb’s problem an agent two-boxes, causal decision theory may concede that the agent did not rationally prepare for the problem. It nonetheless maintains that two-boxing itself is rational. Although two-boxing is not the act of a maximally rational agent, it is rational given the circumstances of Newcomb’s problem.
The basic idea of forming a disposition to one-box has been around for a while. Here’s another one:

Prior to entering Newcomb’s Problem, it is rational to form the disposition to one-box.

Realistic decision theory: rules for nonideal agents … by Paul Weirich, 2004
...and another one:
“Disposition-Based Decision Theory”

This stronger view employs a disposition-based conception of rationality; it holds that what should be directly assessed for ‘rationality’ is dispositions to choose rather than choices themselves. Intuitively, there is a lot to be said for the disposition to choose one-box in Newcomb’s problem – people who go into Newcomb’s problem with this disposition reliably come out much richer than people who instead go in with the disposition to choose two-boxes. Similarly, the disposition to cooperate in a psychologically-similar prisoners’ dilemma reliably fares much better in this scenario than does the disposition to defect. A disposition-based conception of rationality holds that these intuitive observations about dispositions capture an important insight into the nature of practical rationality.
In Eliezer’s article on Newcomb’s problem, he says, “Omega has been correct on each of 100 observed occasions so far—everyone who took both boxes has found box B empty and received only a thousand dollars; everyone who took only box B has found B containing a million dollars.” Such evidence from previous players fails to appear in some problem descriptions, including Wikipedia’s.
For me this is a “no-brainer”. Take box B, deposit it, and come back for more. That’s what the physical evidence says. Any philosopher who says “Taking BOTH boxes is the rational action,” occurs to me as an absolute fool in the face of the evidence. (But I’ve never understood non-mathematical philosophy anyway, so I may be a poor judge.)
Clarifying (NOT rhetorical) questions:
Have I just cheated, so that “it’s not the Newcomb Problem anymore?”
When you fellows say a certain decision theory “two-boxes”, do those theory calculations take the evidence from previous plays into account or not?
Thanks for your time and attention.
There is no opportunity to come back for more. Assume that when you take box B before taking box A, box A is removed.
Yes, I read about “… disappears in a puff of smoke.” I wasn’t coming back for a measly $1K, I was coming back for another million! I’ll see if they’ll let me play again. Omega already KNOWS I’m greedy, this won’t come as a shock. He’ll probably have told his team what to say when I try it.

“… and come back for more.” was meant to be funny.
Anyway, this still doesn’t answer my questions about “Omega has been correct on each of 100 observed occasions so far—everyone who took both boxes has found box B empty and received only a thousand dollars; everyone who took only box B has found B containing a million dollars.”
Someone please answer my questions! Thanks!
The problem needs lots of little hypotheses about Omega. In general, you can create these hypotheses for yourself, using the principle of “Least Convenient Possible World”:
http://lesswrong.com/lw/2k/the_least_convenient_possible_world/
Or, from philosophy/argumentation theory, “Principle of Charity”.
http://philosophy.lander.edu/intro/charity.shtml
In your case, I think you need to add at least two helper assumptions—Omega’s prediction abilities are trustworthy, and Omega’s offer will never be repeated—not for you, not for anyone.
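Under those two helper assumptions, a quick simulation shows what the “100 observed occasions” look like and why the observed one-boxers come out ahead (Python; the 99% accuracy figure and the payoffs are assumptions for illustration, not part of the problem statement):

```python
import random

# Each simulated player meets Omega exactly once; Omega predicts the player's
# strategy correctly with probability `accuracy`.

def play(strategy, accuracy, rng):
    other = "two-box" if strategy == "one-box" else "one-box"
    predicted = strategy if rng.random() < accuracy else other
    box_b = 1_000_000 if predicted == "one-box" else 0
    return box_b if strategy == "one-box" else box_b + 1_000

rng = random.Random(0)
one_boxers = [play("one-box", 0.99, rng) for _ in range(100)]
two_boxers = [play("two-box", 0.99, rng) for _ in range(100)]
print(sum(one_boxers) // 100)   # roughly $990,000 per one-boxer
print(sum(two_boxers) // 100)   # roughly $11,000 per two-boxer
```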
What the physical evidence says is that the boxes are there, the money is there, and Omega is gone. So what does your choice affect, and when?
Well, I mulled that over for a while, and I can’t see any way that contributes to answering my questions.
As to “… what does your choice affect, and when?”, I suppose there are common causes, starting before Omega loaded the boxes, that affect both Omega’s choices and mine. For example, the machinery of my brain. No backwards-in-time is required.
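That common-cause picture is easy to make concrete. In this sketch (Python; the “disposition” variable and the 1% noise rates are my own stand-ins), one upstream fact is read twice, once by Omega before the boxes are loaded and once by the agent when choosing, which is enough to produce the observed correlation with no backwards-in-time influence:

```python
import random

def flip(choice):
    return "two-box" if choice == "one-box" else "one-box"

rng = random.Random(1)
trials, agreements = 10_000, 0
for _ in range(trials):
    disposition = rng.choice(["one-box", "two-box"])   # fixed early: the "machinery of my brain"
    # Omega reads the disposition before loading the boxes; the agent acts on it later.
    prediction = disposition if rng.random() < 0.99 else flip(disposition)
    choice = disposition if rng.random() < 0.99 else flip(disposition)
    agreements += (prediction == choice)
print(agreements / trials)   # about 0.98: strong correlation, though neither event causes the other
```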
Penalising a rational agent for its character flaws while it is under construction seems like a rather weak objection. Most systems have a construction phase during which they may behave imperfectly—so similar objections seem likely to apply to practically any system. However, this is surely no big deal: once a synthetic rational agent exists, we can copy its brain. After that, developmental mistakes would no longer be much of a factor.
It does seem as though this makes CDT essentially correct—in a sense. The main issue would then become one of terminology—of what the word “rational” means. There would be no significant difference over how agents should behave, though.
My reading of this issue is that the case goes against CDT. Its terminology is misleading. I don’t think there’s much of a case that it is wrong, though.
Eric Barnes—while appreciating the benefits of taking one box—has harsh words for the “taking one box is rational” folk.

I go on to claim that although the ideal strategy is to adopt a necessitating disposition to take only one box, it is never rational to choose only one box. I defend my answer against the alternative analysis of the paradox provided by David Gauthier, and I conclude that his understanding of the orthodox theory of rationality is mistaken.
(Sigh.)
Yes, causal decision theorists have been saying harsh words against the winners on Newcomb’s Problem since the dawn of causal decision theory. I am replying to them.
Note that this is the same guy who says:

I conclude that the rational action for a player in the Newcomb Paradox is taking both boxes, but that rational agents will usually take only one box because they have rationally adopted the disposition to do so.
He’s drawing a distinction between a “rational action” and the actions of a “rational agent”.
Newcomb’s Problem capriciously rewards irrational people in the same way that reality capriciously rewards people who irrationally believe their choices matter.