(2′) If A is (sufficiently) strong evidence of B, then the prior probability of A can’t be much higher than the prior probability of B.
The logic and math of this post seem very confused. It feels like you are saying: “If the sun rises tomorrow, I will kill you. The probability of me being a murderer is 1:10^8, therefore the probability of the sun rising tomorrow cannot be much higher than 1:10^8.”
First off, there’s some very crucial evidence you are forgetting in evaluating this case. The key element here is that numerous small bits of evidence are cumulative. This is a very important point, and one which jsteinhardt touched on already.
First, we have a very major piece of evidence: A murder did in fact occur, and the murderer must have been in Perugia at the time they committed this murder. At this point, we have approximately 10^5 possible suspects (Perugia has a population of 166,253), and we know, factually, that one of them is the guilty party. If we had no other evidence, we could reasonably assign a probability of 1:10^5 that each one is guilty. You’ll notice that this is vastly higher than the normal probability of someone being a murderer, because we already have quite a few bits of evidence.
If the burglary was faked with odds of 10^4:1, then we can assume that everyone that had a motive to do so now has a guilt probability of 10^4:10^5, or approximately 1:10. A 10% chance of Amanda Knox being guilty is certainly poor evidence, and I don’t see any reason to favor her over other people who have been demonstrated to have equal motive, but I’m also basing this entirely on this specific post.
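The arithmetic in the two paragraphs above can be sketched as a Bayesian update in odds form. This is only an illustration: it assumes the "10^4:1" figure acts as a likelihood ratio applied to the 1:10^5 prior odds, and the function names are made up for the example.

```python
from fractions import Fraction


def update_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favour (e.g. 1:10 -> Fraction(1, 10)) to a probability."""
    return odds / (1 + odds)


# Prior from the comment: one guilty party among roughly 10^5 residents.
prior_odds = Fraction(1, 10**5)
# Assumption: the 10^4:1 figure is treated as a likelihood ratio.
likelihood_ratio = Fraction(10**4, 1)

posterior_odds = update_odds(prior_odds, likelihood_ratio)
posterior_prob = odds_to_probability(posterior_odds)

print(posterior_odds)         # 1/10
print(float(posterior_prob))  # 0.0909..., i.e. 1/11
```

Under these assumptions, odds of 1:10 correspond to a probability of 1/11 (about 9%), which is where the "approximately 10%" figure comes from.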
The consequences of the burglary being faked do not change based on the probability that it occurred, any more than my threat to kill you tomorrow will prevent the sun from rising. If we’re dealing with probability, then there is some factual probability that the burglary was faked, based on its own evidence, and this probability is entirely independent of the consequences. Further, neither this probability nor the probability that (Burglary Faked ⇒ Amanda is Guilty) can be 100%, despite your post assuming such. You cannot include impossible numbers and then expect a firm conclusion to arise.
P.S. If your point was simply “The judge is assuming impossible numbers”, then I’d feel you are probably wrong on this point. I’d be happy to elaborate if that is in fact the case.
P.P.S. You can argue that a “higher standard of evidence” for proving that may be required, based on legal and moral principles, but that has nothing at all to do with probabilities.
First of all, Welcome to Less Wrong!
Well, if you knew that
(1) if the sun will rise tomorrow, then I am a murderer,
and you also knew that
(2) I am not a murderer,
then you would indeed know that
(3) the sun will not rise tomorrow.
There is very little—certainly very little of importance—that I have forgotten about this case. And I have pretty much all of the publicly available information that exists about it at my fingertips, in case I do forget anything. So, no.
What I am aware of and what I explicitly mention in a particular post are not the same thing.
While this is mathematically as beyond dispute as (say) the formulas I presented in the post, it’s worth noting that approaching something like a murder case in this way is highly dangerous, due to various cognitive biases (which of course are our subject matter here on LW). There is a serious risk of misjudging the strength of such small pieces of evidence, and compounding the error by missing dependence relations, so that you end up double-counting evidence.
But anyway, this doesn’t have much to do with this post.
The intuition you’re describing here is exactly the one that my post aims to refute.
It might seem, as it no doubt did to Massei and Cristiani, that you should be able to establish whether the burglary was fake independently of whether Knox and Sollecito killed Kercher. After all, there isn’t much physical connection between the events in Romanelli’s room and the events in Kercher’s, is there? But this is a mistake—or at least, it is so long as you believe that establishing the burglary was fake would imply that Knox and Sollecito killed Kercher.
In principle, you certainly could establish that the burglary was fake without making any tacit assumption that Knox and Sollecito have a substantial probability of being guilty of murder; but the type of evidence you would need to do that would have to be very strong—around as strong as the evidence needed to show their guilt independently of the burglary question.
I’m not sure what you mean here, but it sounds like you perhaps think that Massei and Cristiani’s reasoning is sound. (Do you think that Knox and Sollecito are likely guilty? If so, I’d be happy to discuss that, but this post wouldn’t be the place to do it.)
If you read the post, you’ll see that it’s pretty much entirely about probabilities.
I feel you can demonstrate quite amply that A is not sufficient proof of B, and that A=>B has not been sufficiently proven either.
However, neither of these assertions seems to be your point. You seem to be insisting that you can’t prove A, and I see absolutely no evidence of that, unless you take as given the assumption A=>B. I would certainly challenge that assumption.
Am I mistaken in this understanding of your point?
P.S. I feel the evidence suggests Knox is guilty at around a 10% chance, based solely on the evidence in this post. I do not feel a 10% chance of guilt is sufficient. I have not considered any evidence outside this post, as my interest is in the probability math, and not in the actual case itself.
P.P.S. A discussion of the dangers of cognitive biases is, I feel, entirely orthogonal to a discussion on probabilities and mathematics. Given my interest is in the math, not the case, I am going to skip over discussion of such biases.
So you don’t agree that if Knox and Sollecito faked the burglary, then they are likely guilty of murder?
There isn’t much evidence presented in this post—hardly any at all. (Plenty of information is linked to, of course...)
Well, then I must say you’re on the wrong website!
But if your interest is more in the math than in the case, I’m not sure what you’re disagreeing with me about. It’s kind of hard to dispute the inequality
$P(A) \leq \frac{P(B)}{P(B|A)}$, isn’t it?
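For readers who want the step behind the inequality P(A) ≤ P(B) / P(B|A): it follows from the fact that the conjunction of A and B cannot be more probable than B alone. A short derivation, assuming P(B|A) > 0 so that the division is defined:

```latex
\begin{aligned}
P(B) &\geq P(A \wedge B) \\
     &= P(B \mid A)\, P(A) \\
\Longrightarrow \quad P(A) &\leq \frac{P(B)}{P(B \mid A)}
\end{aligned}
```

When P(B|A) is close to 1, the bound is close to P(A) ≤ P(B), which is statement (2′).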
Your post is entangling three separate issues, and I think that’s making it confusing to discuss (it was certainly confusing to read!)
Mathematics: “P(A) ≤ P(B) / P(B|A).”
No argument here.
Probability: How does the evidence A impact the probability of conclusion B?
I feel you are using entirely incorrect math for the situation, as stated in my previous posts. Just because the formula is correct, does not mean it is applicable to the problem you are trying to solve.
If A is proven, and A=>B is proven, then B is proven. The prior probability of B cannot negate the proof of A, nor the proof of A=>B, and thus has absolutely no bearing on the situation. Prior probability matters if, and only if, we are discussing p(A) and p(A=>B), at which point we still have new evidence (A, A=>B) that requires us to update to a new probability of B.
You cannot continue to assert the prior probability of B, despite new evidence that suggests a higher or lower chance of B.
Cognitive Bias: Is the judge properly evaluating p(A) and p(A=>B)?
I feel that there is insufficient information to draw a firm conclusion here. However, based on what you have said, I feel rather strongly that you have misinterpreted his evaluations, because you are assuming that common language and logical language are the same.
Agreed.
This sentence doesn’t make sense as written. I don’t know what it means for a probability to “negate” a proof, and so I don’t know what you’re trying to say when you assert that this can’t happen.
My best guess is that you’re trying to say that “even if P(A) is small on account of P(B) being small, some finite amount of evidence will still suffice to prove A, and therefore B.” Which is obviously true, and nothing I have written says otherwise.
This sounds like our previous discussion, where you said, and I agreed, that other evidence that Knox and Sollecito killed Kercher could raise the probability of their having faked the burglary. I’ve never disputed this, but have pointed out that this isn’t Massei and Cristiani’s reasoning. They attempted to prove the fake burglary without invoking the other murder evidence.
You’ll have to be more specific here.
Ahhh, you make so much more sense when you phrase it this way!
“other evidence that Knox and Sollecito killed Kercher could raise the probability of their having faked the burglary”
But my point is, this is backwards. It only works if you assume with near-100% certainty that faking the burglary and being the murderer are correlated. Otherwise “faked the burglary” IS simply evidence that Knox is the murderer.
If we prove that Knox killed Kercher, it proves that any 100% correlation is true. It does NOT prove any less-than-100% correlation. It’s even entirely possible for a correlation to be one-directional (A implies B, but B does not imply A).
Thus, Knox killed Kercher is only proof of a faked burglary if you already assume the correlation is proven and two-directional.
In probability, “correlations” are always bidirectional. Bayes’ theorem:
$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
If P(B|A) > P(B), then P(A|B) > P(A). By the same factor even:
$\frac{P(A|B)}{P(A)} = \frac{P(B|A)}{P(B)}$
The analogy to biconditionality in deductive logic would be P(A|B) = P(B|A), which obviously isn’t always true.
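A quick numeric check of the claim that learning B multiplies P(A) by the same factor that learning A multiplies P(B), while the two conditionals themselves differ. The joint distribution below is hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B); the numbers are illustrative only.
joint = {
    (True, True): Fraction(3, 20),
    (True, False): Fraction(1, 20),
    (False, True): Fraction(5, 20),
    (False, False): Fraction(11, 20),
}

p_a = sum(p for (a, _), p in joint.items() if a)  # marginal P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # marginal P(B)
p_ab = joint[(True, True)]                        # P(A and B)

p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a

# Update factors: how much each event raises the probability of the other.
factor_a = p_a_given_b / p_a
factor_b = p_b_given_a / p_b

print(p_a_given_b, p_b_given_a)  # 3/8 3/4  (the conditionals differ)
print(factor_a, factor_b)        # 15/8 15/8  (the update factors agree)
```

Both factors equal P(A and B) / (P(A) P(B)), which is why the relation is symmetric even though P(A|B) and P(B|A) are not equal.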
I’m just trying to understand your point a bit better. Hopefully you don’t mind the late reply (I’ve been on vacation for a while)
“In probability, “correlations” are always bidirectional.”
Can’t there be three separate, equally valid points which, if proven, would prove she was the murderer? Even if those three equally valid proofs of her guilt are contradictory? Once we know she is guilty, they can’t all three be true, can they?
I’m not sure how one would accurately express this, given what you’re saying. The probability that A implies Guilt, B implies Guilt, and C implies Guilt can all be 100%, yes? Obviously, the probability that guilt implies all of A+B+C is 0%, since they are contradictory. Therefore, how can it be correct to assume the opposite correlation, that Guilt implies A at 100% certainty?
It isn’t!
In general it is not true that P(A|B) = P(B|A). P(A|Guilt) depends on the prior probabilities of A and Guilt, as well as P(Guilt|A). For example, say we have four possible proofs A, B, C, D, and P(Guilt|A or B or C) = 1, and P(Guilt|D) = 0. Our prior is all four are equally likely: P(A) = P(B) = P(C) = P(D) = 0.25. P(Guilt) is then 0.75 = P(Guilt|A)P(A) + P(Guilt|B)P(B)...
Given this, we have:
$$\begin{aligned}P(A|Guilt) &= \frac{P(Guilt|A)P(A)}{P(Guilt)}\\ &= \frac{1.0 \times 0.25}{0.75}\\ &= \frac{1}{3}\end{aligned}$$
P(A|Guilt) isn’t 1. But it’s 33%, which is still higher than the prior 25%: that is, Guilt is evidence for A.
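The four-hypothesis example can be reproduced in a few lines. This sketch assumes, as the example does, that A, B, C and D are mutually exclusive and exhaustive; the variable names are made up for the illustration.

```python
from fractions import Fraction

# Equal priors over four mutually exclusive, exhaustive hypotheses.
prior = {h: Fraction(1, 4) for h in "ABCD"}

# P(Guilt | hypothesis), as given in the example.
p_guilt_given = {
    "A": Fraction(1),
    "B": Fraction(1),
    "C": Fraction(1),
    "D": Fraction(0),
}

# Law of total probability: P(Guilt) = sum over h of P(Guilt | h) * P(h).
p_guilt = sum(p_guilt_given[h] * prior[h] for h in prior)

# Bayes' theorem: P(h | Guilt) = P(Guilt | h) * P(h) / P(Guilt).
posterior = {h: p_guilt_given[h] * prior[h] / p_guilt for h in prior}

print(p_guilt)         # 3/4
print(posterior["A"])  # 1/3
print(posterior["D"])  # 0
```

Learning Guilt raises each of A, B and C from 1/4 to 1/3 and rules out D, so Guilt is evidence for A without being proof of it.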
By the way I think it might help if you avoid talking in proofs and implication and 100% certainty. In hypothetical examples it’s useful to set things to P(X) = 1, but in the real world evidence is always probabilistic; nothing’s ever 100%.
Ahhh, that helps clear things up. For some reason I’d been understanding you as saying that, given P(Guilt|A) = 1, P(A|Guilt) was also 1. It looks like what you meant was just that Guilt is evidence for, but not necessarily 100% proof of, A. Am I getting that all correct?
Yes.
P(Guilt|A) = P(A|Guilt) only when P(A) = P(Guilt). In which case it would be 100% proof. But that is a rare situation.
Nitpick: the two conditionals would also be equal if A and Guilt were mutually exclusive (in that case, of course, both conditionals would be zero).
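Both cases in this exchange fall out of writing the two conditionals over a common numerator. A short derivation, assuming P(A) > 0 and P(Guilt) > 0 so that both conditionals are defined:

```latex
\begin{aligned}
P(\mathrm{Guilt} \mid A) = \frac{P(A \wedge \mathrm{Guilt})}{P(A)},
\qquad
P(A \mid \mathrm{Guilt}) = \frac{P(A \wedge \mathrm{Guilt})}{P(\mathrm{Guilt})} \\[4pt]
P(\mathrm{Guilt} \mid A) = P(A \mid \mathrm{Guilt})
\iff
P(A \wedge \mathrm{Guilt}) = 0 \ \text{ or } \ P(A) = P(\mathrm{Guilt})
\end{aligned}
```

The first alternative is the mutually exclusive case from the nitpick; the second is the equal-priors case.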
Theorem: If A is evidence of B, then B is also evidence of A.
Proof: To say that A is evidence of B means that P(A|B) > P(A|~B), or in other words that P(A&B)/P(B) > P(A&~B)/P(~B), which we may write as P(A&B)/P(B) > (P(A)-P(A&B))/(1-P(B)). Algebraic manipulation turns this into P(A&B) > P(A)P(B), which is symmetric in A and B; hence we can undo the manipulations with the roles of A and B reversed to arrive back at P(B|A) > P(B|~A). QED.
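The symmetry in the theorem can also be spot-checked numerically. The sketch below draws random joint distributions with strictly positive cells (an assumption that keeps every conditional well defined) and confirms that the three conditions in the proof always agree; the function name and the sampling scheme are invented for the illustration.

```python
import random
from fractions import Fraction


def evidence_relations(w_ab, w_a_nb, w_na_b, w_na_nb):
    """Return the truth values of the three conditions from the proof.

    Arguments are positive integer weights for the cells
    (A&B, A&~B, ~A&B, ~A&~B) of a joint distribution.
    """
    total = w_ab + w_a_nb + w_na_b + w_na_nb
    p_ab = Fraction(w_ab, total)
    p_a = Fraction(w_ab + w_a_nb, total)
    p_b = Fraction(w_ab + w_na_b, total)

    b_is_evidence_of_a = p_ab / p_b > (p_a - p_ab) / (1 - p_b)  # P(A|B) > P(A|~B)
    a_is_evidence_of_b = p_ab / p_a > (p_b - p_ab) / (1 - p_a)  # P(B|A) > P(B|~A)
    positively_correlated = p_ab > p_a * p_b                    # P(A&B) > P(A)P(B)
    return b_is_evidence_of_a, a_is_evidence_of_b, positively_correlated


rng = random.Random(0)  # fixed seed so the check is repeatable
all_agree = True
for _ in range(10_000):
    weights = [rng.randint(1, 100) for _ in range(4)]
    if len(set(evidence_relations(*weights))) != 1:
        all_agree = False

print(all_agree)  # True
```

Exact fractions are used instead of floats so that borderline cases, where A and B are exactly independent, are not misclassified by rounding.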
Hence, if A implies B, then B also implies A!
Now of course, the strengths of these implications might be vastly different. But that’s a separate matter.
Here, the point is that A implies B with near certainty (where A is “K&S faked burglary” and B is “K&S killed Kercher”); I’m not terribly concerned with how strongly B implies A. I don’t need for B to imply A very strongly to make my point, but Massei and Cristiani would definitely need that in order to enable any charitable reading of their burglary section at all.