Have you read the original article? The payoff is less if you follow ordinary decision theory, and yet the whole point of decision theory is to maximize the payoff.
Yes, I read that article, and at least a half dozen articles along the same line, and dozens of pages of commentary. I also remember the first LW article that I read; something about making beliefs “pay rent” in anticipated experiences. Since I don’t anticipate receiving a visit from Omega or anyone else who can read my mind, please forgive me if I don’t take the superiority of revisionist decision theories seriously. Show me a credible example where it does better. One which doesn’t involve some kind of spooky transmission of information backward in time.
You are missing the point. Newcomb’s problem, and other problems involving Omega, are unit tests for mathematical formalizations of decision theory. When a decision theory gets a contrived problem wrong, we care not because that scenario might appear in real life, but because it demonstrates that the math is wrong in a way that might make it subtly wrong on other problems, too.
I think you are missing the point. Newcomb’s problem is equivalent to dividing by zero. Decision theories aren’t supposed to behave well when abused in this way. If they behave badly on this problem, maybe it is the fault of the problem rather than the fault of the theory.
If someone can present a more robust decision theory, UDT or TDT, or whatever, which handles all the well-formed problems just as well as standard game theory, and also handles the ill-formed problems like Newcomb in accord with EY’s intuitions, then I think that is great. I look forward to reading the papers and textbooks explaining that decision theory. But until they have gone through at least some serious process of peer review, please forgive me if I dismiss them as just so much woo and/or vaporware.
Incidentally, I specified “EY’s intuitions” rather than “correctness” as the criterion of success, because unless Omega actually appears and submits to a series of empirical tests, I can’t imagine a more respectable empirical criterion.
IMO, you haven’t made a case for that—and few here agree with you.
If you really think randomness is an issue, imagine a deterministic program facing the problem, with no good source of randomness to hand.
No, randomness is kind of a red herring. I shouldn’t have brought it up.
At one point I thought I had a kind of Dutch Book argument against Omega—if he could predict some future “random” event which I intended to use in conjunction with a mixed strategy, then I should be able to profit by making side bets “hedging” my choice with respect to Omega. But when I looked more carefully, it didn’t work.
Yay: honesty points!
True prisoner’s dilemma. Also, the prisoner’s dilemma generally. Newcomb just exemplifies the theme.
I don’t understand. You are answering my “show me”? Standard game theory says to defect in both PD and TPD. You have a revisionist decision theory that does better?
TDT does better, yes. My apologies; I’d forgotten the manuscript hasn’t yet been released to the public. It should be soon, I think; it’s been in the review process for a while. If for some reason Eliezer changed his mind and decided not to publish it then I’d be somewhat surprised. I’m guessing he’s nervous because it’s his first opportunity to show academia that he’s a real researcher and not just a somewhat bright autodidact.
There was a decision theory workshop a few months ago and a bunch of decision theorists are still working on solving the considerably harder problems that were introduced at that time. Decision theory is still unsolved, but UDT/TDT/ADT/XDT are a lot closer to solving it than the ancient CDT/EDT/SDT.
At the risk of looking stupid:
What are ADT and XDT?
For that matter, what’s SDT?
On the assumption that SDT stands for Sequential Decision Theory, I would like to take a shot at explaining this one, as well as at clarifying the relationship among CDT, EDT, and SDT. Everyone feel free to amend and extend my remarks.
Start with simple Bayesian updating. This is a theory of knowing, not a theory of acting. It helps you to know about the world, but doesn’t tell you what to do with your knowledge (other than to get more knowledge). There are two ways you can go from here: SDT and EDT.
SDT is basically game theory as developed by Selten and Harsanyi. It adds agents, actions, and preferences to the world of propositions which exists in simple Bayesianism. Given the preferences of each agent regarding the propositions, and the agents’ beliefs about the effects which their actions have on the truth or falsehood of propositions regarding which they have preferences, SDT advises each agent on their choice of actions. It is “sequential” because the decisions have to be considered in strict temporal order. For example, in Parfit’s hitchhiker problem, both the hitchhiker and the motorist probably wish that the hitchhiker’s decision to pay $100 could be made before the motorist’s decision whether to offer a ride. But, in SDT, the decisions cannot be made in this reverse order. By the same token, you cannot observe the future before deciding in the present.
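To make the “strict temporal order” point concrete, here is a quick Python sketch of the backward-induction reasoning; the payoff numbers are purely illustrative assumptions of mine, not anything from the problem statement:

    # A minimal sketch of SDT-style backward induction on Parfit's hitchhiker.
    # All payoff numbers are illustrative assumptions.

    # Stage 2: the rescued hitchhiker chooses whether to pay the $100.
    def hitchhiker_choice():
        payoffs = {"pay": 1_000_000 - 100, "stiff": 1_000_000}
        return max(payoffs, key=payoffs.get)      # -> "stiff"

    # Stage 1: the motorist, predicting stage 2, chooses whether to offer the ride.
    def motorist_choice():
        predicted = hitchhiker_choice()
        payoffs = {"offer": 100 if predicted == "pay" else -10, "drive_on": 0}
        return max(payoffs, key=payoffs.get)      # -> "drive_on"

    print(motorist_choice(), hitchhiker_choice())  # drive_on stiff

Because the later decision is resolved first and then treated as fixed, no commitment the hitchhiker might wish he could make can reach back and change the motorist’s calculation, and both end up worse off than if the reverse ordering were allowed.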
If at least some of the agents in SDT believe that some of the other agents are rational, then you have game theory and things can get complicated. On the other hand, if you have only one agent, or if none of the agents believe that the others are rational, then you have classical decision theory which goes back to Wald (1939).
EDT is a variant of single-agent SDT due to Richard Jeffrey (1960s). In it, actions are treated just like any other proposition, except that some agents can make decisions that set action-propositions to be either true or false. The most interesting thing about EDT is that it is relatively “timeless”. That is, if X is an action, and A is an agent, then (A does X) might be thought of as a proposition. Using ordinary propositional logic, you can build and reason with compound propositions such as P → (A does X), ((A does X) & Q) → P, or (A does X) → (B does Y). The “timeless” aspect to this is that “A does X” is interpreted as “Either A did X, or is currently doing X, or will do X; I don’t really care about when it happens”.
The thing that makes EDT into a decision theory is the rule which says roughly “Act so as to make your preferred propositions true.” If EDT worked as well as SDT, it would definitely be considered better, if only because of Ockham’s razor. It is an extremely elegant and simple theory. And it does work remarkably well. The most famous case where it doesn’t work (at least according to the SDT fans) is Newcomb’s problem. SDT says to two-box (because your decision cannot affect Omega’s already frozen-in-time decision). EDT says to one-box (because it can’t even notice that the causality goes the wrong way). SDT and EDT also disagree regarding Parfit’s hitchhiker.
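If it helps, here is a small Python calculation of the disagreement on Newcomb, using the usual textbook payoffs and an assumed 90%-accurate predictor (the numbers are mine, purely for illustration):

    # Why EDT one-boxes while SDT-style dominance reasoning two-boxes.
    p = 0.9                    # assumed accuracy of the prediction
    M, K = 1_000_000, 1_000    # box B and box A amounts

    # EDT: treat your own action as evidence about what was predicted.
    edt_one_box = p * M + (1 - p) * 0
    edt_two_box = p * K + (1 - p) * (M + K)
    print(edt_one_box, edt_two_box)        # 900000.0 vs 101000.0 -> one-box

    # SDT: the prediction is already frozen; compare actions with contents fixed.
    for million_in_box in (True, False):
        one_box = M if million_in_box else 0
        two_box = one_box + K              # two-boxing adds $1,000 either way
        print(million_in_box, one_box, two_box)   # two-boxing dominates

The conditional expectation rewards one-boxing precisely because EDT “can’t notice” that the correlation runs through a decision already made, while the dominance argument ignores that correlation entirely.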
CDT is an attempt to improve on both SDT and EDT. It seems to be a work in progress. There are two variants out there right now: one built by philosophers and the other primarily the work of computer scientist Judea Pearl. (I think I prefer Pearl’s version.) CDT helps to clarify the relationship between causation and correlation in Bayesian epistemology (i.e. learning). It also clarifies the relationship between action-based propositions (which are modeled in both SDT and EDT as somehow getting their truth value from the free will of the agents) and other propositions, which get their truth value from the laws of physics. In CDT (Pearl’s version, at least) an action can be both free and determined; the flexibility reminds me of the compatibilist dissolution of the free will question which is suggested by the LW sequences.
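The distinction Pearl draws between conditioning and intervening is easy to show with a toy example (my own made-up numbers): a hidden common cause drives both the action and the outcome, so observing the action is evidence about the outcome even though performing the action changes nothing.

    # "Seeing" vs "doing" in Pearl's sense: C -> A and C -> Y, but no A -> Y arrow.
    P_C = 0.5
    P_A_given_C = {0: 0.1, 1: 0.9}   # C makes the action much more likely
    P_Y_given_C = {0: 0.2, 1: 0.8}   # C also makes the good outcome more likely

    def p_joint(c, a, y):
        pc = P_C if c else 1 - P_C
        pa = P_A_given_C[c] if a else 1 - P_A_given_C[c]
        py = P_Y_given_C[c] if y else 1 - P_Y_given_C[c]
        return pc * pa * py

    # Conditioning on A=1: the correlation through C survives.
    num = sum(p_joint(c, 1, 1) for c in (0, 1))
    den = sum(p_joint(c, 1, y) for c in (0, 1) for y in (0, 1))
    print("P(Y=1 | A=1)     =", num / den)    # 0.74

    # Intervening, do(A=1): cut the C -> A arrow and average over C directly.
    print("P(Y=1 | do(A=1)) =", sum((P_C if c else 1 - P_C) * P_Y_given_C[c]
                                    for c in (0, 1)))    # 0.5

An agent that decided by conditioning would credit its action with the 0.74, which is exactly the confusion between correlation and causation that CDT is built to avoid.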
I don’t know whether that summary answers the question you wanted answered, but I’m pretty sure the corrections I am likely to receive will answer the questions I want answered. :)
[Edit: corrected typos]
I think ADT is only described on Vladimir Nesov’s blog (if there) and XDT nowhere findable. ADT stands for Ambient Decision Theory. Unfortunately there’s no comprehensive and easy summary of any of the modern decision theories anywhere. Hopefully Eliezer publishes his TDT manuscript soon.
I coined the name XDT here. I think Anna Salamon and Steve Rayhawk had come up with essentially the same idea prior to that (and have explored its implications more deeply, but not in published form).
Thanks. I couldn’t find any references to ADT on Vladimir Nesov’s blog but I only had a quick scan so maybe I missed it, will have a better look later. And I can now remember that series of comments on XDT but my mind didn’t connect to it, thanks for the link.
DT list, nothing on the blog. Hopefully I’ll write up the current variant (which is conceptually somewhat different) in the near future.
Wow. I didn’t realise Eliezer had decided to actually release something formally. My recollection was that he was refusing to work on it unless someone promised him a PhD.
Does better how? By cooperating? By achieving a reverse-Omega-like stance and somehow constraining the other player to cooperate, conditionally on cooperating ourselves? I am completely mystified. I guess I will have to wait for the paper(s).
I don’t think there are any papers. There’s only this ramble:
http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/
As I said, I think your correspondents are in rather a muddle—and are discussing a completely different and rather esoteric PD case—where the agents can see and verify each other’s source code.
Thanks for the link. It was definitely telegraphic, but I think I got a pretty good notion of where he is coming from with this, and also a bit about where he is going. I’m sure you remember the old days back at sci.bio.evolution, talking about the various complications with the gene-level view of selection and Hamilton’s rule. Well, give another read to EY’s einsatz explanation of TDT:

The one-sentence version is: Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.
Does that remind you of anything? “As you are deciding how the expression of you as a gene is going to affect the organism, remember to take into account that you are deciding for all of the members of your gene clone, and that changing the expression of your clone in other organisms is going to have an impact on the fitness of your own containing organism.” Now that is really cool. For the first time I begin to see how different decision theories might be appropriate for different meanings of the term “rational agent”.
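Here is my own toy rendering of that one-sentence version, as a prisoner’s dilemma against an exact copy of yourself (standard assumed payoffs T=5, R=3, P=1, S=0):

    # PD against an exact copy running the same decision procedure.
    PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

    # SDT/CDT-style reasoning: hold the copy's move fixed; defection dominates.
    for their_move in ("C", "D"):
        best = max("CD", key=lambda my: PAYOFF[(my, their_move)])
        print("if the copy plays", their_move, "-> play", best)   # D both times

    # TDT-style reasoning: the copy computes the same output I do, so compare
    # only the two logically possible worlds where we both play the same move.
    best = max("CD", key=lambda out: PAYOFF[(out, out)])
    print("controlling the shared output -> play", best)          # C (3 > 1)

The “shared output” is playing exactly the role of the shared gene in the clonal-selection picture: one decision, expressed in many places at once.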
I can’t claim to have understood everything EY wrote in that sketch, but I did imagine that I understood his concerns regarding “counterfactual surgery”. I want to get a hold of a preprint of the paper, when it is ready.
I think your correspondents are in rather a muddle—and are discussing a completely different and rather esoteric PD case—where the agents can see and verify each other’s source code. In which case, C-C is perfectly possible.
You have been given at least one such example in this thread, and even if you had not, the process of taking an idealised problem and creating a more mundane example should be one you are familiar with, if you are as well versed in the literature as you claim.
Where was I given such an example? The only example I saw was of an unreliable Omega, an Omega who only gets it right 90% of the time.
If that is the example you mean, then (1) I agree that it adds unnecessary complexity by bringing in irrelevant considerations, and (2) I claim it is still f’ing impossible.