I applaud anyone who figures out stuff for themselves and posts it for the benefit of others, but this post is extremely unclear. How do you define these funky “causal probabilities” to someone who only knows regular conditional probabilities? And how can kissing the baby be evidence for anything, if it’s determined entirely by which decision theory you adopt? In short, I feel your “explanation” doesn’t look inside black boxes, only shuffles them around. I’d prefer a more formal treatment to ensure that no lions lurk in the shadows.
Are you referring to the fact that evidential decision theories rely on conditional probability, working as follows:
Modelling probability in terms of ways the world could be (i.e. if the world can be two ways and A is true in one of them, then A is 50% probable).
Imagine the world has ten ways to be: A is true in five of them, B is true in six, but A and B are both true in only two.
So our probability of A is 5⁄10 = 1⁄2.
Our probability of A given B is 2⁄6 = 1⁄3.
Why? Because B being true reduces the number of ways the world can be to six, and in only two of these is A true.
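As a minimal sketch of the counting above (the Python code and the particular arrangement of worlds are mine, purely for illustration):

```python
# Ten equally likely ways the world could be, each tagged with whether
# A and B hold in it: 2 worlds with both, 3 with A only, 4 with B only,
# 1 with neither -- so A holds in 5 worlds and B in 6.
worlds = (
    [{"A": True,  "B": True}]  * 2 +
    [{"A": True,  "B": False}] * 3 +
    [{"A": False, "B": True}]  * 4 +
    [{"A": False, "B": False}] * 1
)

def prob(pred, ws):
    """Fraction of the given worlds in which pred holds."""
    return sum(pred(w) for w in ws) / len(ws)

p_a = prob(lambda w: w["A"], worlds)            # 5/10 = 0.5
b_worlds = [w for w in worlds if w["B"]]        # conditioning on B = discarding the non-B worlds
p_a_given_b = prob(lambda w: w["A"], b_worlds)  # 2/6 = 0.33...

print(p_a, p_a_given_b)
```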
So, the answer to the question about how kissing the baby is evidence is as above: It’s evidence because it rules out some possible ways that the world could be. For example, there may originally have been ten worlds, five where the politician was likeable and five where they weren’t. The politician kisses the baby in only one where he’s unlikeable and in four where he’s likeable. Kissing the baby then reduces it down to five ways the world could be and in four of these he’s likeable so by kissing the baby, the probability is higher that the world is such that he is likeable.
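The same counting applied to the baby-kissing numbers in the paragraph above (again just an illustrative sketch):

```python
# Ten equally likely worlds: five where the politician is likeable and
# five where he isn't.  He kisses the baby in four of the likeable
# worlds and in one of the unlikeable ones.
worlds = (
    [{"likeable": True,  "kisses": True}]  * 4 +
    [{"likeable": True,  "kisses": False}] * 1 +
    [{"likeable": False, "kisses": True}]  * 1 +
    [{"likeable": False, "kisses": False}] * 4
)

# Seeing the kiss rules out the five worlds where no kiss happens.
kiss_worlds = [w for w in worlds if w["kisses"]]

p_likeable = sum(w["likeable"] for w in worlds) / len(worlds)                        # 5/10 = 0.5
p_likeable_given_kiss = sum(w["likeable"] for w in kiss_worlds) / len(kiss_worlds)   # 4/5 = 0.8

print(p_likeable, p_likeable_given_kiss)
```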
I don’t expect that to be a breakthrough for you. I’m just asking whether that’s the sort of thing you were thinking I should have said (but better crafted).
As to the difference between conditional and causal probability, causal probabilities would be a subset of the conditional ones where “A causes B”. What it means to say “A causes B” seems beyond the scope of an introductory article to me though. Or am I missing what you mean? Is there a simple way to explain what a causal probability is at an introductory level?
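For what it’s worth, one informal way to gesture at the distinction (roughly in Pearl’s style, and with numbers I am making up for illustration): a conditional probability looks only at the worlds where the action happens to occur, while a causal or interventional probability asks what happens if the action is forced regardless of its usual causes. A rough sketch:

```python
import random

random.seed(0)

# Toy causal story: likeability causes both baby-kissing and winning;
# kissing itself does nothing to winning.  All numbers are made up.
def sample_world(force_kiss=None):
    likeable = random.random() < 0.5
    if force_kiss is None:
        kisses = random.random() < (0.8 if likeable else 0.2)  # kissing caused by likeability
    else:
        kisses = force_kiss                                    # intervention: set it directly
    wins = random.random() < (0.7 if likeable else 0.3)        # winning caused by likeability only
    return kisses, wins

N = 100_000

# Conditional probability P(win | kiss): keep only worlds where the kiss occurs naturally.
observed = [sample_world() for _ in range(N)]
p_win_given_kiss = sum(w for k, w in observed if k) / sum(k for k, _ in observed)

# Interventional probability P(win | do(kiss)): force the kiss in every world.
intervened = [sample_world(force_kiss=True) for _ in range(N)]
p_win_do_kiss = sum(w for _, w in intervened) / N

print(p_win_given_kiss)  # around 0.62: the kiss is evidence of likeability, hence of winning
print(p_win_do_kiss)     # around 0.50: forcing the kiss doesn't cause winning
```

In this toy model the two numbers come apart, which is roughly the gap between “evidence for” and “causes” that the question about causal probabilities is pointing at; it is only a sketch, not a formal account of causation.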
I think I’ve made it obvious in my postings here that I consider I have a lot to learn. If you think you can see a way I should have done it, I’d be really interested to know what it was and I could try and edit the post or write another post to explain it.
IMO, right now decision theory is not a settled topic to write tutorials like this about. You might say that I was dissatisfied with the tone of your post: it implied that somewhere there are wise mathematicians who know what “causal probabilities” mean, etc. In truth there are no such wise mathematicians. (Well, you could mention Judea Pearl as a first approximation, but AFAIK his work doesn’t settle the issues completely, and he uses a very different formalism.) Any honest introduction should clearly demarcate the dubious parts with “here be dragons”. When I started learning decision theory, introductions like yours wasted a huge amount of my time and frustrated me to no end, because they always seemed to assume things that just weren’t there in the specialized literature.
You sound like someone who is in a position to write a great intro to DT. Would you consider doing that, or perhaps collaborating with this post’s author?
That would feel a bit like hijacking. The obvious candidates for writing such a post are Eliezer, Gary Drescher, or Wei Dai. I don’t know why they aren’t doing that, probably they feel that surveying the existing literature is quite enough to make yourself confused. I’ll think about your suggestion, though. If I find an honest and accessible angle, I’ll write a post.
I’m not writing a tutorial on decision theory because I think a simple informal understanding of expected utility maximization is sufficient for almost all practical decision making, and people who are interested in the technical details, or want to work on things like anthropic reasoning or Newcomb’s problem can easily find existing material on EDT and CDT. (I personally used the book that I linked to earlier. And I think it is useful to survey the existing literature, if only to confirm that a problem exists and hasn’t already been solved.)
But anyway, Adam Bell seems to be doing a reasonable job of explaining EDT and CDT to a non-technical audience. If he is successful, it might give people a better idea of what it is that we’re actually trying to accomplish with TDT and UDT.
Looks like this is being addressed:
http://lesswrong.com/lw/2lg/desirable_dispositions_and_rational_actions/2gec?c=1
Also note an addition to the post: Appendix 2. I don’t feel like going into these details here would benefit all beginners (it may benefit some but disadvantage others) but you’re right that I can at least signpost that there is an issue and people who want more details can get a bit of a start from reading these comments.
Fair enough. I understand what you’re saying and it’s a shame that this sort of introduction caused problems when you were learning decision theory. However, I feel like this is just the sort of thing that I did need to help me learn decision theory. Sometimes you need to have a flawed but simple understanding before you can appreciate the flaws in your understanding. I’m the sort of person who would probably never get there if I was expected to see the issues in the naive presentation straight away.
Maybe this introduction won’t be suitable for everyone, but I don’t feel like these issues mean it will be useful to no-one. However, I can see what you mean about at least signposting that there are unresolved issues. My current plan is to introduce various issues in decision theory so people at least understand what the discussion is about, and then to write a state-of-play post that surveys the unresolved issues in the field and outlines why decision theory is still a wide open field.
That may go some way to resolving your concerns or you may just feel like such an approach is pointless. However, I do hope that this post will benefit some people and some types of learners even if it doesn’t benefit you personally or people who learn in the same way as you.
This depends on what you mean by “learn” and what objective you want to achieve by learning. I don’t believe in having a “flawed but simple understanding” of a math topic: people who say such things usually mean that they can recite some rehearsed explanations, but cannot solve even simple problems on the topic. Solving problems should come first, and intuitive explanations should come later.
Imagine you live in the middle ages and decide to study alchemy. So you start digging in, and after your first few lessons you happily decide to write an “intuitive introduction to alchemy techniques” so your successors can pass the initial phase more easily. I claim that this indicates a flawed mindset. If you cannot notice (“cannot be expected to see”, as you charmingly put it) that the whole subject doesn’t frigging work, isn’t your effort misplaced? How on Earth can you be satisfied with an “intuitive” understanding of something that you don’t even know works?
I apologize if my comments here sound rude or offensive. I’m honestly trying to attack what I see as a flawed approach you have adopted, not you personally. And I honestly think that the proper attitude to decision theory is to treat it like alchemy: a pre-paradigmatic field where you can hope to salvage some useful insights from your predecessors, but most existing work is almost certainly going to get scrapped.
No need to worry about being rude or offensive—I’m happy to talk about issues rather than people and I never thought we were doing anything different. However, I wonder if a better comparison is with someone studying “Ways of discovering the secrets of the universe.” If they studied alchemy and then looked at ways it failed that might be a useful way of seeing what a better theory of “discovering secrets” will need to avoid.
That’s my intention: study CDT and see where it falls down, so that we have a better sense of what a decision theory needs to do before exploring other approaches. You might do the same with alchemy and explain its flaws. But first you have to explain what alchemy is before you can point out the issues with it. That’s what this post is doing: explaining what causal decision theory is seen to be before we look at the problems with this perception.
To look at alchemy’s flaws, you first need to know what alchemy is. Even if you can see it’s flawed from the start, that doesn’t mean a step by step process can’t be useful.
Or that’s how I feel. Further disagreement is welcome.
Sorry for deleting my comment—on reflection it sounded too harsh.
Maybe it’s just me, but I don’t think you’re promoting the greater good when you write an intuitive tutorial on a confused topic without screaming in confusion yourself. What’s the hurry, anyway? Why not make some little bits perfectly clear for yourself, and then write?
Here’s an example of an intuitive explanation (of an active research topic, no less) written by someone whose thinking is crystal clear: Cosma Shalizi on causal models. One document like that is worth a thousand “monad tutorials” written by Haskell newbies.
Maybe there should be a top-level post on how causal decision theory is like burritos?
I can’t believe you just wrote that. The whole burrito thing is just going to confuse people, when it’s really a very straightforward topic.
Just think of decision theory as if it were cricket...
At least that would be a change from treating decision theory as if it were all about prison.
I don’t think you’ve sounded harsh. You obviously disagree with me but I think you’ve done so politely.
I guess my feeling is that different people learn differently, and I’m not as convinced as you seem to be that this is the wrong way for all people to learn (as opposed to the wrong way for some people to learn). I grant that I could be wrong on this, but I feel that I, at the very least, would gain something from this sort of tutorial. I’m open to being proven wrong if there’s a chorus of dissenters.
Obviously, I could write a better explanation of decision theory if I had researched the area for years and had a better grasp of it. However, that’s not the case, so I’m left to decide what I should do given the experience I do have.
I am writing this hoping that doing so will benefit some people.
And doing so doesn’t stop me writing a better tutorial when I do understand the topic better. I can still do that when that time comes, and in the meantime create something that hopefully has positive value.
Thx for the Shalizi link. I’m currently slogging my way through Pearl, and Shalizi clarifies things.
At first I thought that AdamBell had invented Evidential Decision Theory from whole cloth, but I discovered by Googling that it really exists. Presumably it makes sense for different problems; it certainly did not for the baby-kissing story as presented.
As far as I know, there’s still no non-trivial formalization of the baby-kissing problem (aka Smoking Lesion). I’d be happy to be proved wrong on that.
In short, I feel your “explanation” doesn’t look inside black boxes, only shuffles them around.
That’s not Adam Bell’s fault. Those black boxes are inherent in CDT. You can read a CDT proponent’s formal treatment of causal probabilities here, and see for yourself.