True prisoner’s dilemma. Also, the prisoner’s dilemma generally. Newcomb just exemplifies the theme.
I don’t understand. You are answering my “show me”? Standard game theory says to defect in both PD and TPD. You have a revisionist decision theory that does better?
TDT does better, yes. My apologies; I’d forgotten the manuscript hasn’t yet been released to the public. It should be soon, I think; it’s been in the review process for a while. If for some reason Eliezer has changed his mind and decided not to publish it, then I’d be somewhat surprised. I’m guessing he’s nervous because it’s his first opportunity to show academia that he’s a real researcher and not just a somewhat bright autodidact.
There was a decision theory workshop a few months ago, and a bunch of decision theorists are still working on the much harder problems that were introduced there. Decision theory is still unsolved, but UDT/TDT/ADT/XDT are a lot closer to solving it than the ancient CDT/EDT/SDT.
At the risk of looking stupid:
What are ADT and XDT?
For that matter, what’s SDT?
On the assumption that SDT stands for Sequential Decision Theory, I would like to take a shot at explaining it, and at clarifying the relationship among CDT, EDT, and SDT. Everyone feel free to amend and extend my remarks.
Start with simple Bayesian updating. This is a theory of knowing, not a theory of acting. It helps you to know about the world, but doesn’t tell you what to do with your knowledge (other than to get more knowledge). There are two ways you can go from here: SDT and EDT.
SDT is basically game theory as developed by Selten and Harsanyi. It adds agents, actions, and preferences to the world of propositions which exists in simple Bayesianism. Given the preferences of each agent regarding the propositions, and the agents’ beliefs about the effects which their actions have on the truth or falsehood of propositions regarding which they have preferences, SDT advises each agent on their choice of actions. It is “sequential” because the decisions have to be considered in strict temporal order. For example, in Parfit’s hitchhiker problem, both the hitchhiker and the motorist probably wish that the hitchhiker’s decision to pay $100 could be made before the motorist’s decision whether to offer a ride. But, in SDT, the decisions cannot be made in this reverse order. By the same token, you cannot observe the future before deciding in the present.
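To make the “strict temporal order” point concrete, here is a minimal sketch of SDT-style backward induction on the hitchhiker problem. The dollar figures other than the $100 payment are invented for illustration.

```python
# A minimal sketch of SDT-style backward induction on Parfit's hitchhiker.
# All numbers except the $100 payment are illustrative assumptions.

def hitchhiker_in_town(payment=100):
    """Stage 2: already rescued, so paying is a pure loss and SDT says don't."""
    return "pay" if -payment > 0 else "don't pay"

def motorist_in_desert(ride_cost=20, payment=100):
    """Stage 1: the motorist predicts stage 2 and decides accordingly."""
    predicted = hitchhiker_in_town(payment)
    income = payment if predicted == "pay" else 0
    return "offer ride" if income - ride_cost > 0 else "drive on"

print(hitchhiker_in_town())   # -> don't pay
print(motorist_in_desert())   # -> drive on; the reverse order is unavailable
```

Both parties would prefer the (pay, offer) outcome, but because the stage-2 decision cannot be made first, SDT-rational agents never reach it.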
If at least some of the agents in SDT believe that some of the other agents are rational, then you have game theory and things can get complicated. On the other hand, if you have only one agent, or if none of the agents believe that the others are rational, then you have classical decision theory which goes back to Wald (1939).
EDT is a variant of single-agent SDT due to Richard Jeffrey (1960s). In it, actions are treated just like any other proposition, except that some agents can make decisions that set action-propositions to be either true or false. The most interesting thing about EDT is that it is relatively “timeless”. That is, if X is an action, and A is an agent, then (A does X) might be thought of as a proposition. Using ordinary propositional logic, you can build and reason with compound propositions such as P → (A does X), ((A does X) & Q) → P, or (A does X) → (B does Y). The “timeless” aspect to this is that “A does X” is interpreted as “Either A did X, or is currently doing X, or will do X; I don’t really care about when it happens”.
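A tiny sketch of the Jeffrey-style calculation may help. The joint distribution below is an invented toy example; the only point is that EDT scores an action by conditioning on the proposition “A does X”, with no causal machinery anywhere.

```python
# A minimal sketch of Jeffrey-style EDT: "A does X" is just another
# proposition to condition on. The joint distribution is an invented toy.

joint = {                      # P(action, outcome)
    ("exercise", "healthy"): 0.40,
    ("exercise", "sick"):    0.10,
    ("laze",     "healthy"): 0.15,
    ("laze",     "sick"):    0.35,
}
utility = {"healthy": 10, "sick": 0}

def edt_value(action):
    """Expected utility conditional on the proposition 'A does action'."""
    p_action = sum(p for (a, _), p in joint.items() if a == action)
    return sum((p / p_action) * utility[o]
               for (a, o), p in joint.items() if a == action)

for a in ("exercise", "laze"):
    print(a, edt_value(a))     # exercise 8.0, laze 3.0 -- EDT says exercise
```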
The thing that makes EDT into a decision theory is the rule which says, roughly, “Act so as to make your preferred propositions true.” If EDT worked as well as SDT, it would definitely be considered better, if only because of Ockham’s razor. It is an extremely elegant and simple theory. And it does work remarkably well. The most famous case where it doesn’t work (at least according to the SDT fans) is Newcomb’s problem. SDT says to two-box (because your decision cannot affect Omega’s already frozen-in-time decision). EDT says to one-box (because it can’t even notice that the causality goes the wrong way). SDT and EDT also disagree regarding the Hitchhiker.
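The disagreement is easy to see in numbers. Here is a sketch with the usual payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one); the 99% predictor accuracy is my own assumption.

```python
# Newcomb's problem in numbers. Payoffs are the standard ones; the 99%
# predictor accuracy is an assumption for illustration.

ACC = 0.99                       # P(Omega predicted my choice correctly)
M, K = 1_000_000, 1_000

# EDT conditions on the action as evidence about the prediction.
edt_one_box = ACC * M + (1 - ACC) * 0
edt_two_box = ACC * K + (1 - ACC) * (M + K)
print("EDT:", "one-box" if edt_one_box > edt_two_box else "two-box")

# SDT/CDT holds the (already frozen) prediction fixed: two-boxing then
# dominates, adding $1,000 whatever the opaque box contains.
for p_full in (0.0, 0.5, 1.0):
    assert p_full * M + K > p_full * M   # dominance at every fixed prediction
print("SDT/CDT: two-box, by dominance")
```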
CDT is an attempt to improve on both SDT and EDT. It seems to be a work in progress. There are two variants out there right now—one built by philosophers and the other primarily the work of computer scientist Judea Pearl. (I think I prefer Pearl’s version.) CDT helps to clarify the relationship between causation and correlation in Bayesian epistemology (i.e. learning). It also clarifies the relationship between action-based propositions (which are modeled in both SDT and EDT as somehow getting their truth value from the free will of the agents) and other propositions, which get their truth value from the laws of physics. In CDT (Pearl’s version, at least) an action can be both free and determined—the flexibility reminds me of the compatibilist dissolution of the free will question which is suggested by the LW sequences.
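Pearl’s observation/intervention distinction is the heart of it, and a toy model shows the gap. The model below is entirely invented: a hidden gene causes both smoking and cancer, smoking itself does nothing, so observing smoking is terrible news while intervening to smoke is causally harmless.

```python
# Invented toy model: gene -> smoking, gene -> cancer; smoking itself has
# no causal effect. Conditioning and intervening then come apart.
import random
random.seed(0)

def sample(do_smoke=None):
    gene = random.random() < 0.5
    smoke = gene if do_smoke is None else do_smoke  # do() overrides the mechanism
    cancer = gene                                   # caused by the gene alone
    return smoke, cancer

def p_cancer_observing_smoke(n=100_000):
    """P(cancer | smoke): look at natural smokers."""
    smokers = [c for s, c in (sample() for _ in range(n)) if s]
    return sum(smokers) / len(smokers)

def p_cancer_do_smoke(n=100_000):
    """P(cancer | do(smoke)): force smoking, leave the gene alone."""
    return sum(c for _, c in (sample(do_smoke=True) for _ in range(n))) / n

print(p_cancer_observing_smoke())  # 1.0  -- smoking is perfect evidence of the gene
print(p_cancer_do_smoke())         # ~0.5 -- forcing it changes no causal facts
```

EDT, which only has the first quantity, would pay to avoid smoking here; Pearl-style CDT, which evaluates acts with the second, correctly doesn’t care.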
I don’t know whether that summary answers the question you wanted answered, but I’m pretty sure the corrections I am likely to receive will answer the questions I want answered. :)
[Edit: corrected typos]
I think ADT is only described on Vladimir Nesov’s blog (if there), and XDT is nowhere to be found. ADT stands for Ambient Decision Theory. Unfortunately there’s no comprehensive and easy summary of any of the modern decision theories anywhere. Hopefully Eliezer publishes his TDT manuscript soon.
I coined the name XDT here. I think Anna Salamon and Steve Rayhawk had come up with essentially the same idea prior to that (and have explored its implications more deeply, but not in published form).
Thanks. I couldn’t find any references to ADT on Vladimir Nesov’s blog, but I only had a quick scan, so maybe I missed it; I’ll have a better look later. And I can now remember that series of comments on XDT, but my mind didn’t connect to it. Thanks for the link.
On the decision-theory mailing list; there’s nothing on the blog. Hopefully I’ll write up the current variant (which is conceptually somewhat different) in the near future.
Wow. I didn’t realise Eliezer had decided to actually release something formally. My recollection was that he was refusing to work on it unless someone promised him a PhD.
Does better how? By cooperating? By achieving a reverse-Omega-like stance and somehow constraining the other player to cooperate, conditionally on cooperating ourselves? I am completely mystified. I guess I will have to wait for the paper(s).
I don’t think there are any papers. There’s only this ramble:
http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/
As I said, I think your correspondents are in rather a muddle—and are discussing a completely different and rather esoteric PD case—where the agents can see and verify each other’s source code.
Thanks for the link. It was definitely telegraphic, but I think I got a pretty good notion of where he is coming from with this, and also a bit about where he is going. I’m sure you remember the old days back at sci.bio.evolution, talking about the various complications with the gene-level view of selection and Hamilton’s rule. Well, give another read to EY’s ersatz explanation of TDT:

The one-sentence version is: Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.
Does that remind you of anything? “As you are deciding how the expression of you as a gene is going to affect the organism, remember to take into account that you are deciding for all of the members of your gene clone, and that changing the expression of your clone in other organisms is going to have an impact on the fitness of your own containing organism.” Now that is really cool. For the first time I begin to see how different decision theories might be appropriate for different meanings of the term “rational agent”.
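For what it’s worth, the analogy goes through in payoff terms. Here is a minimal sketch with the standard PD payoffs: a CDT-style agent holds the other player’s move fixed and defects, while an agent that treats itself as deciding for every instantiation of its computation can only reach the diagonal of the payoff matrix.

```python
# A minimal sketch of "deciding for all copies" in a one-shot PD.
# Standard illustrative payoffs: T=5 > R=3 > P=1 > S=0.

PAYOFF = {                     # (my move, other's move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

# CDT-style: hold the other's move fixed; defection dominates either way.
for their_move in ("C", "D"):
    assert PAYOFF[("D", their_move)] > PAYOFF[("C", their_move)]

# TDT-style against a copy of the same computation: my output is mirrored,
# so only the diagonal outcomes (m, m) are reachable.
tdt_choice = max("CD", key=lambda m: PAYOFF[(m, m)])
print(tdt_choice)              # C -- because (C, C) beats (D, D)
```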
I can’t claim to have understood everything EY wrote in that sketch, but I did imagine that I understood his concerns regarding “counterfactual surgery”. I want to get hold of a preprint of the paper, when it is ready.
I think your correspondents are in rather a muddle—and are discussing a completely different and rather esoteric PD case—where the agents can see and verify each other’s source code. In which case, C-C is perfectly possible.
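A crude version of that case is easy to write down. The program below is my own toy (nothing from the TDT manuscript): it cooperates exactly when the opponent’s source text matches its own, so two copies get C-C while it still defects against everyone else.

```python
# A toy "clique bot" for the source-code-visible PD: cooperate iff the
# opponent is a textual copy of this very program. My own illustration.
import inspect

def cliquebot(opponent_source: str) -> str:
    my_source = inspect.getsource(cliquebot)
    return "C" if opponent_source == my_source else "D"

me = inspect.getsource(cliquebot)
print(cliquebot(me))                 # C -- mutual cooperation between copies
print(cliquebot("def bot(): pass"))  # D -- defect against anything else
```

(The textual-equality test is of course brittle; the interesting versions try to verify logical rather than literal equivalence.)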