Frank: It is impossible for A and ~A to both be evidence for B. If a lack of sabotage is evidence for a fifth column, then an actual sabotage event must be evidence against a fifth column. Obviously, had there been an actual instance of sabotage, nobody would have thought that way; they would have used the sabotage as more “evidence” for keeping the Japanese locked up. It’s the Salem witch trials, only in a more modern form: if the woman/Japanese has committed crimes, this is obviously evidence for “guilty”; if they are innocent of any wrongdoing, this too is a proof, for criminals like to appear especially virtuous to gain sympathy.
As I understand it, there were at least three hypotheses under consideration:
a) No members (or a negligibly small fraction) of the ethnic group in question will make any attempt at sabotage.
b) There will be attempts at sabotage by members of the ethnic group in question, but without any particular organization or coordination.
c) There is a well-disciplined covert organization which is capable of making strategic decisions about when and where to commit acts of sabotage.
The prior for A was very low, and any attempt by the Japanese government to communicate with saboteurs in the States could be considered evidence against it. Lack of sabotage is evidence for C over B.
Lack of sabotage is obviously evidence for a fifth column trying to lull the government, given the fifth column exists, since the opposite (sabotage occurring) is very strong evidence against that.
However lack of sabotage is still much stronger evidence towards the fifth column not existing.
The takeaway is that if you are going to argue that X group is dangerous because they will commit Y act, you cannot use a lack of Y as weak evidence that X exists, because then Y would be strong evidence that X does not exist, and Y is what you are afraid X is going to do!
You would be much better off using the fact that no sabotage occurred as weak evidence that the 5th column was preventing sabotage.
If there is other evidence that suggests the 5th column exists and that they are dangerous, that is the evidence that should be used. Making up non-evidence (which is actually counter evidence) is not the way to go about it. There are ways of handling court cases that must remain confidential (though it would certainly make the court look bad, it is the right way to do it).
I think you’re right, but there’s an adjustment (an update, isn’t it called?) warranted in two directions.
The absence of sabotage decreases the likelihood of the fifth column existing at all.
But if there is a fifth column, it could be reasonably predicted that there would be evidence of sabotage unless there was an attempt to keep a low profile.
If they were to favor this hypothesis for other reasons, as in the classified data mentioned by Frank, then the lack of apparent sabotage would also increase the probability that if the unlikely fifth column DID exist, it would be one which is keeping a low profile.
I grant, of course, at the same time, the decreased probability of there being any kind of fifth column in the first place.
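The two-directional update described above can be sketched numerically. This is only an illustration: the three scenarios, the priors, and the likelihoods below are made-up assumptions, not figures from the historical record or from anyone in this thread.

```python
from fractions import Fraction as F

# Hypothetical priors over three scenarios (illustrative assumptions only).
prior = {
    "no_column": F(90, 100),
    "low_profile_column": F(2, 100),
    "active_column": F(8, 100),
}

# Hypothetical P(no sabotage observed | scenario), also assumed.
p_no_sabotage = {
    "no_column": F(99, 100),
    "low_profile_column": F(95, 100),
    "active_column": F(10, 100),
}


def update(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}


def p_column(dist):
    """Probability that any fifth column exists."""
    return dist["low_profile_column"] + dist["active_column"]


def p_low_profile_given_column(dist):
    """Probability the column is the low-profile kind, given that one exists."""
    return dist["low_profile_column"] / p_column(dist)


posterior = update(prior, p_no_sabotage)

# Existence of any column: prior vs. posterior after seeing no sabotage.
print(float(p_column(prior)), float(p_column(posterior)))
# Low-profile share among columns: prior vs. posterior.
print(float(p_low_profile_given_column(prior)),
      float(p_low_profile_given_column(posterior)))
```

Under these assumed numbers, seeing no sabotage lowers the probability that any fifth column exists, while raising the probability that a column, if there is one, is the low-profile kind. Different assumed inputs would change the sizes of the shifts.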
It is impossible for A and ~A to both be evidence for B. If a lack of sabotage is evidence for a fifth column, then an actual sabotage event must be evidence against a fifth column.
This is not correct.
One explanation (call it A) for why there fails to be sabotage is that the Fifth Column is trying to be sneaky and inflict maximum damage later on when no one expects it. The probability of that is greater than 0, so it is a legitimate potential explanation for the apparent absence of sabotage. But, on further thought, there is this other possible explanation (call it B): the absence of a Fifth Column will produce an absence of sabotage. The probability of this is also greater than 0.
So here we have the event (Fifth Column exists) constituting evidence for (absence of sabotage) (perhaps the probability is low, but not zero). Surely it is fair to take it for granted that ~(Fifth Column exists) also constitutes evidence for (absence of sabotage). So that’s an example where an event and its negation can potentially both be evidence for something.
I think what you really mean to say is that since P(no sabotage) = P(no sabotage | Fifth Column) P(Fifth Column) + P(no sabotage | no Fifth Column) P(no Fifth Column), and since no sabotage has been observed, making P(no sabotage) = 1, this must imply that P(no sabotage | Fifth Column) P(Fifth Column) = 1 - P(no sabotage | no Fifth Column) P(no Fifth Column).
If we then make the (perhaps unwarranted) assumption that the prior probabilities are equal, i.e. P(Fifth Column) = P(no Fifth Column), then when deciding via a maximum a posteriori decision rule which hypothesis to believe, we wind up with P(no sabotage | Fifth Column) = 1 - P(no sabotage | no Fifth Column), and thus we simply select the hypothesis corresponding to whichever conditional probability is larger. From this, our intuitions about basic logic would say it doesn’t make sense to assign probabilities in such a way that (no Fifth Column) is less likely to cause no sabotage than (Fifth Column), and this is what creates the effect you are noting that some event A and its negation ~A shouldn’t both be evidence for the same thing.
Compactly, it’s only fair to claim that A and ~A cannot both be evidence for B in some very special situations. In general, though, A and ~A definitely can both serve as supporting evidence for B, it’s just that they will corroborate B to different degrees and this may or may not be further offset by the prior distributions of A and ~A.
But it is important not to assert the incorrect generalization that “It is impossible for A and ~A to both be evidence for B.”
Did you take the other replies to Tom McCabe’s comment, which raise the same question you do but offer the opposite answer, into consideration? The appeal to intuition that a fifth column might be refraining from sabotage in order to create more effective sabotage later does not let you take both A and ~A as evidence for B. Any way you verbally justify it, you will still be Dutch-bookable and incoherent.
Without losing the generality of the theorems of probability, let me address your particular narrative: If you believe that, if a fifth column exists, it is of the type that will assuredly refrain from sabotage now in order to prepare a more devastating strike later; then observing sabotage (or no sabotage) cannot alter your probability that a fifth column exists.
Without losing the generality of the theorems of probability, let me address your particular narrative: If you believe that, if a fifth column exists, it is of the type that will assuredly refrain from sabotage now in order to prepare a more devastating strike later;
This is a fancy way of saying that you assume the fifth column’s intent is totally independent of the observance of sabotage: P(A | B) = P(A). That is, no evidence can update your position along the lines of Bayes’ theorem.
This is not what I am saying. I am saying that P(A | B) and P(A | ~B) can both be nonzero, and in the Bayesian sense this is what is meant by evidence. Either observing sabotage or failing to observe sabotage can, strictly speaking, corroborate the belief that there is a secret Fifth Column. If you make the further assumption that the actions of the Fifth Column are independent from your observations about sabotage, then yes, everything you said is correct.
My only point is that, in general, you cannot say that it is a rule of probability that A and ~A cannot both be evidence for B. You must be talking about specific assumptions involving independence for that to hold.
It also makes sense to think orthogonally about A and ~A in the following sense: if these are my only two hypotheses, then if there is any best decision, it is because under some decision rule, either A or ~A maximizes the a posteriori probability, but not both. If the posterior was equi-probable (50/50) for the hypotheses, then observing or not observing sabotage would change nothing. This could happen if you make the independence assumption above, but even if you don’t, it could still happen that the priors and conditional probabilities just work out to that particular case, and there would be no optimal belief in the Bayesian sense.
For a concrete example, suppose I flip a coin and if it is Heads, I will eat a tuna sandwich with probability 3⁄4 and a chicken sandwich with probability 1⁄4, and if it is Tails I will eat a turkey sandwich with probability 3⁄4 and a chicken sandwich with probability 1⁄4. Now suppose you only get to see what sandwich I select and then must make your best guess about what the coin showed. If I select a chicken sandwich, then you would believe that either Heads or Tails could serve as evidence for this decision. Neither result would be surprising to you (i.e., neither result would change your model) if you learned of it after I selected a chicken sandwich.
In this case, both A and ~A can serve as evidence for chicken, to the tune of 1⁄4 in each case. A is much stronger evidence for tuna, ~A is much stronger evidence for turkey, but both, to some extent, are evidence of chicken.
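The numbers in the sandwich example can be worked out exactly. The sketch below assumes a fair coin, which the example implies but does not state; the function names are made up for illustration.

```python
from fractions import Fraction as F

P_HEADS = F(1, 2)  # assumption: the coin is fair

# P(sandwich | coin result), taken from the example above.
P_SANDWICH = {
    "heads": {"tuna": F(3, 4), "chicken": F(1, 4), "turkey": F(0)},
    "tails": {"tuna": F(0), "chicken": F(1, 4), "turkey": F(3, 4)},
}


def p_sandwich(sandwich):
    """Marginal probability of a sandwich, summing over both coin results."""
    return (P_HEADS * P_SANDWICH["heads"][sandwich]
            + (1 - P_HEADS) * P_SANDWICH["tails"][sandwich])


def p_heads_given(sandwich):
    """Posterior probability of Heads after seeing the sandwich."""
    return P_HEADS * P_SANDWICH["heads"][sandwich] / p_sandwich(sandwich)


for s in ("tuna", "chicken", "turkey"):
    print(s, p_sandwich(s), p_heads_given(s))
```

With these inputs, P(chicken | Heads), P(chicken | Tails), and the marginal P(chicken) all come out equal, and seeing chicken leaves the probability of Heads where it started. So both coin results are compatible with chicken, but neither one raises its probability above the baseline.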
I’m not disagreeing with your claim about probability theory at all. I’m just saying that we don’t know that Warren made the assumption that his observations about sabotage were independent from the existence of a Fifth Column. For all we know, it was just that he had such a strong prior belief (which may or may not have been rational in itself) that there was a Fifth Column, that even after observing no sabotage, his decision rule was still in favor of belief in the Fifth Column.
It’s not that he mistakenly thought that the Fifth Column would definitely act in one way or the other. It’s just that both no sabotage and sabotage were, to some degree, compatible with his strong prior that there was a Fifth Column… enough so that after converting it to a posterior it didn’t cause him to change his position.
A is evidence for B if P(B|A) > P(B). That is to say, learning A increases your belief in B. It is a fact from probability theory that P(B) = P(B|A)P(A) + P(B|¬A)P(¬A). If P(B|A) > P(B) and P(B|¬A) > P(B) then that says that:
P(B) > P(B)P(A) + P(B)P(¬A)
P(B) > P(B)(P(A) + P(¬A))
P(B) > P(B)
Since A and ¬A are exhaustive and exclusive (so P(A) + P(¬A) = 1) this is a contradiction.
On the other hand, P(B|A) and P(B|¬A) being nonzero just means both A and ¬A are consistent with B—that is, A and ¬A are not disproofs of B.
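As a sanity check on the derivation above (not a substitute for it), here is a small sketch that samples many combinations of P(A), P(B|A), and P(B|¬A) and looks for a case where both conditionals exceed P(B). The helper names and the sampling grid are arbitrary choices for illustration.

```python
import random
from fractions import Fraction as F


def p_b(p_a, p_b_given_a, p_b_given_not_a):
    """Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)


def both_raise_b(p_a, p_b_given_a, p_b_given_not_a):
    """True if learning A and learning not-A would each raise P(B)."""
    marginal = p_b(p_a, p_b_given_a, p_b_given_not_a)
    return p_b_given_a > marginal and p_b_given_not_a > marginal


rng = random.Random(0)  # fixed seed so the run is repeatable


def random_probability():
    """Exact rational probability on a grid of thousandths."""
    return F(rng.randint(0, 1000), 1000)


counterexamples = sum(
    both_raise_b(random_probability(), random_probability(), random_probability())
    for _ in range(10_000)
)
print(counterexamples)
```

Exact fractions are used instead of floats so that rounding cannot produce a spurious counterexample. Since P(B) is a weighted average of the two conditionals, it can never sit below both of them, which is what the search reflects.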
Your definitions do not match mine, which come from here:
The key data-dependent term Pr(D | M) is a likelihood, and is sometimes called the evidence for model or hypothesis, M; evaluating it correctly is the key to Bayesian model comparison. The evidence is usually the normalizing constant or partition function of another inference, namely the inference of the parameters of model M given the data D.
The evidence for the hypothesis M is Pr(D | M), regardless of whether or not Pr(D) > Pr(D | M), at least according to that page and this statistics book sitting here at my desk (pages 184-186), and perhaps other sources.
If it’s just a war over definitions, then it’s not worth arguing. My point is that it’s misleading to act like that attribute you call ‘consistency’ doesn’t play a role in what could fuel reasoning like Warren’s above. It’s not about independence assumptions or mistakes about what can be evidence (do you really think Warren cared about the technical, Bayesian definition of evidence in his thinking?). It’s about understanding a person’s formation of prior probabilities in addition to the method by which they convert them to posteriors.
You’ve used “evidence” to refer to the probability P(D | M). We’re talking about the colloquial use of “evidence for the hypothesis” meaning an observation that increases the probability of the hypothesis. This is the sense in which we’ve been using “evidence” in the OP.
If you draw 5 balls from an urn, and they’re all red, that’s evidence for the hypothesis that the next ball will be red, and so you conclude that the next one could be red, with a bit more certainty than you had before. If you draw 5 balls from an urn, and they’re blue, that’s evidence against the hypothesis that the next one will be red, so you conclude that the next one is less likely to be red than you thought before.
Your thought processes are wrong by the Bayesian proof, however, if every sequence of 5 balls leads you to increase your belief that the next one will be red.
This is essentially what Warren did. If he observed sabotage he would have increased his belief in the existence of a fifth column, and yet, observing no sabotage he also increased his belief in the existence of a fifth column. Clearly, somewhere he’s made a mistake.
I see your point and I think we mostly agree about everything. My only slight extra point is to suggest that perhaps Warren was trying to use his prior beliefs to predict an explanation for absence of sabotage, rather than trying to use absence of sabotage to intensify his prior beliefs. In retrospect, it’s likely that you’re right about Warren and the quote makes it seem that he did, in fact, think that absence of sabotage increased likelihood of Fifth Column. But in general, though, I think a lot of people make a mistake that has more to do with starting out with an unreasonable prior, or making assumptions that their prior belief is independent of observations, than it has to do with a logical fallacy about letting conditioning on both A and ~A increase the probability of B.
The reply that it “is impossible for A and ~A to both be evidence for B” is to ignore what Frank said in favor of insisting on the very overgeneralization I think he was trying to point out. It’s not impossible at all when we are being imprecise enough about the prior expectations involved, such as when we lump all moments in a sustained effort together.
Here’s an example to illustrate what I’m saying: Say you are a parent of a 10 year old boy who generally wants to stay up past his bedtime. His protests vary from occasional temper tantrums to the usual slumped-shoulders expression of disappointment that bedtime has finally arrived. Under normal circumstances, the expectation is that he would give at least some evidence of wanting to stay up later. We’ll call this resistance “A,” and A is evidence for “B”: his desire and motivation to stay up later. What shall we say when ~A happens? That is, what shall we say when the boy one day suddenly goes enthusiastically to bed? That he has given up his desire? That it is impossible for this to be evidence of his continued desire and motivation? Of course not. It is exactly what we would expect a motivated and reasonably intelligent person to do: try different and probably more effective strategies. If we generalize the ongoing experience of the little boy’s quest to stay up later, A and ~A are both evidence of B. “It is impossible for A and ~A to both be evidence for B” is simply not narrow enough to be a true statement, and using it in that way can easily amount to a bad counterargument.
Rather, we need to be specific about each situation. What I think we should pay attention to here is the prior expectation of B. With a high enough prior, A and ~A could either (but not both) be evidence of B. But if we are not being specific to each precise situation, the generalization “it is impossible for A and ~A to both be evidence of B” can be a very subtle straw man, because the person being argued against may not be relying on the assumption that A and ~A are equal evidence for B at the same time and in the same situation.
Returning to the Japanese Fifth Column argument, unlike the little boy in my example the Japanese (and, in general, descendants from countries that go to war with their current country of citizenship) do not have a consistent track record of wartime sabotage. Also, there isn’t any reason to think they would not generally be more loyal to their country of citizenship than the country of their parents, grandparents, or even of their own childhood. So there is no particularly strong expectation that they would commit sabotage… and thus no such expectation that some mysterious lack of sabotage is itself a sign of a new strategic attempt as part of a sustained effort. The argument should come down to the prior expectation of Japanese sabotage. That seems to be the crux of it to me.
It seems to me the weakness in Frank’s argument also lies in the basic premise that we should expect the Japanese to commit sabotage. And I believe the governor would need to rely on that premise, or a similar one, in order to sustain his argument beyond what Eliezer presented.
But. The inside information premise seems nearly undefeatable to me. We can’t comment on information we don’t have. I think that is always a possibility with controversial official responses that most people would prefer to deny. If the person whose claims you are evaluating has secret but pertinent information you don’t have access to, then it can be very difficult to offer a fair analysis. For one, you will have very different yet subjectively valid prior expectations.
If we generalize the ongoing experience of the little boy’s quest to stay up later, A and ~A are both evidence of B.
You seem to be using “evidence of X” to mean something along the lines of “consistent with X”. That’s not what it means in this context.
An event is evidence for or against a scenario insofar as it changes your subjective probability estimate for that scenario. Your example child going enthusiastically to bed is in fact evidence that he’s changed his mind about staying up past his bedtime: it makes that scenario subjectively more plausible, even though it’s still probably a long-shot option given what you know. It might simultaneously be evidence for some new bedtime-avoidance scheme, but that’s entirely consistent with it also pointing to a possible change of heart: the increased probability of both is made up for in the reduced probability of him continuing with his old behavior.
Subjective probabilities for either/or scenarios have to sum to unity, and so evidence for one such option has to be balanced out by evidence against one or more of the others. A and ~A cannot both be evidence for a given scenario; at best they can both leave it unaffected.
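The balancing act described above can be illustrated with one worked case. The prior and likelihoods here are hypothetical numbers chosen only to make the arithmetic visible; H stands for "a fifth column exists" and E for "sabotage is observed".

```python
from fractions import Fraction as F


def posteriors(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(E), P(H | E), and P(H | not E) via Bayes' rule."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    post_e = p_e_given_h * p_h / p_e
    post_not_e = (1 - p_e_given_h) * p_h / (1 - p_e)
    return p_e, post_e, post_not_e


# Hypothetical inputs (assumptions for illustration only).
p_h = F(1, 10)              # prior P(H)
p_e_given_h = F(3, 5)       # P(E | H)
p_e_given_not_h = F(1, 20)  # P(E | not H)

p_e, post_e, post_not_e = posteriors(p_h, p_e_given_h, p_e_given_not_h)

# Average of the two possible posteriors, weighted by how likely each
# observation is before looking.
expected_posterior = p_e * post_e + (1 - p_e) * post_not_e

print(post_e, post_not_e, expected_posterior)
```

In this sketch one observation pushes the probability of H above the prior and the other pushes it below, and the probability-weighted average of the two posteriors lands back on the prior. That is the sense in which an upward shift from one outcome has to be paid for by a downward shift from the other.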
I think I understand that a little better now. So thank you for taking the time to explain that to me.
Even so, it seems all I must do is add to my counterexample a prior track record of the little boy changing strategies while pretending to go along with authority. Reconsidering my little boy example with that in mind, does that change your reply?
Also, I fail to see how your response ameliorates my objection to the claim “it is impossible for A and ~A to both be evidence for B.” By your own explanation, they are both evidence, albeit offering unequal relative probabilities (forgive me if I’m getting the password wrong there, but I think you can surmise what it is I’m getting at). Maybe if we say that “It is impossible for A and ~A to both offer the same relative probability for B at the same time and concerning the same situation and given the same subjective view of the facts, etc,” we have something that doesn’t lead us to claim things that are not true about someone else’s argument, as in the case above, that their argument depends on A and ~A at the same time and in the same way, when the precise claim in question is actually that A can be evidence for B in one situation; and based upon the expectation set upon the observance of subsequent facts, at some later date, ~A could also end up being evidence for B. I’m not sure if I’ve explained that clearly, but I’ll keep trying until either I get what I’m missing, or I manage to express clearly what may well be coming out as gibberish. Either way, I get a little slice of the self-improvement I’m looking for.
Thanks again, and I hope you can forgive my wet ears on this and bear with me. The benefits of our exchanges here will probably be pretty one sided; I have almost nothing to offer a more experienced rationalist here, and lots to gain… and I realize that, so bear with me, and please know I am grateful for the feedback.
based upon the expectation set upon the observance of subsequent facts, at some later date, ~A could also end up being evidence for B
Here’s a contradiction with A and ~A both being evidence for the same thing. You could tell your spouse “Go up and check if little Timmy went to bed”. Before ze comes back you already have an estimate of how likely Timmy is to go to bed on time (your prior belief). But then your spouse, who was too tired to climb the stairs, comes back and tells you “Little Timmy may or may not have gone to bed”. Now, if both of those possibilities would be evidence of Timmy’s staying up late then you should update your belief accordingly. But how can you do that without receiving any new information?
Yes. I get that. We cannot use A and ~A to update our estimates in the same way at the same time. That’s not the same as saying that it is impossible for A and ~A to be evidence of the same thing. One could work on Tuesday, and the other could work on Friday, depending on the situation. That was my only point: can’t generalize a timeline but need to operate at specific points on that timeline. That goes back to the justification for interning Japanese citizens. If we say ~A just can’t ever be evidence of B because at some previous time A was evidence for B, then we are making a mistake. At some later date, ~A could end up being better evidence, depending on the situation. My point was that a better counterargument to the governor’s justification is to point out that the prospect of naturalized citizens turning against their home country in favor of their country of ancestry presents a very low prior, because the Japanese (and other groups that polyglot nations have gone to war with) have not usually behaved that way in the past. I could be wrong, but it doesn’t have anything to do with updating estimates with a variable and its negation to reach the same probability at the same time. I pretty much agree with what you said, just not the implication that it conflicts in some way with what I said.
I think the brief answer to this point is that it is very important to define the hypothesis precisely, to avoid being confused in the way you describe.
Applying that lesson to Earl Warren, we can say that he failed to distinguish between motive for big-sabotage and motive for little-sabotage. Lack of little-sabotage events is evidence in favor of motive-for-big-sabotage (and for no-motive-to-sabotage: with the benefit of hindsight, we know this was the true state of the world). But unclear phrasing by Warren made it sound like he believed absence of little-sabotage was evidence of motive-to-little-sabotage, which is a nonsensical position.
Given the low probability of big-sabotage (even after incorporating the evidence Warren puts forth), it’s pretty clear that the argument for Warren’s suggested policy (Japanese-American internment) depended pretty heavily on the confused thinking created by this equivocation.
BTW, what would you consider evidence for a genuine attempt to lull the government into a false sense of security (in an analogous situation)?
Lack of sabotage is obviously evidence for a fifth column trying to lull the government, given the fifth column exists, since the opposite—sabotage occuring—is very strong evidence against that.
However lack of sabotage is still much stronger evidence towards the fifth column not existing.
The takeaway is that if you are going to argue that X group is dangerous because they will commit Y act, you cannot use a lack of Y as weak evidence that X exists, because then Y would be strong evidence that X does not exist, and Y is what you are afraid X is going to do!
You would be much better off using the fact that no sabotage occurred as weak evidence that the 5th column was preventing sabotage.
If there is other evidence that suggests the 5th column exists and that they are dangerous, that is the evidence that should be used. Making up non-evidence (which is actually counter evidence) is not the way to go about it. There are ways of handling court cases that must remain confidential (though it would certainly make the court look bad, it is the right way to do it).
I think you’re right, but there’s an adjustment (an update, isn’t it called?) warranted in two directions.
The absence of sabotage decreases the likelihood of the fifth column existing at all.
But if there is a fifth column, it could be reasonably predicted that there would be evidence of sabotage unless there was an attempt to keep a low profile. If they were to favor this hypothesis for other reasons, as in the classified data mentioned by Frank, then the lack of apparent sabotage would also increase the probability that if the unlikely fifth column DID exist, it would be one which is keeping a low profile. I grant, of course, at the same time, the decreased probability of there being any kind of fifth column in the first place.
This is not correct.
One explanation (call it A) for why there fails to be sabotage is that the Fifth Column is trying to be sneaky and inflict maximum damage later on when no one expects it. The probability of that is greater than 0, so it is a legitimate potential explanation for the apparent absence of sabotage. But, on further thought, there is this other possible explanation (call it B): the absence of a Fifth Column will produce an absence of sabotage. The probability of this is also greater than 0.
So here we have the event (Fifth Column exists) constituting evidence for (absence of sabotage) (perhaps the probability is low, but not zero). Surely it is fair to take it for granted that ~(Fifth Column exists) also constitutes evidence for (absence of sabotage). So that’s an example where an event and its negation can potentially both be evidence for something.
I think what you really mean to say is that since P(no sabotage) = P(no sabotage | Fifth Column) P(Fifth Column) + P(no sabotage | no Fifth Column) P(no Fifth Column), and since no sabotage has been observed, making P(no sabotage) = 1, this must imply that P(no sabotage | Fifth Column) P(Fifth Column) = 1 - P(no sabotage | no Fifth Column) P(no Fifth Column).
If we then make the (perhaps unwarranted) assumption that the prior probabilities are equal, i.e. P(Fifth Column) = P(no Fifth Column), then when deciding via a maximum a posterior decision rule which hypothesis to believe, we wind up with P(no sabotage | Fifth Column) = 1 - P(no sabotage | no Fifth Column), and thus we simply select the hypothesis corresponding to whichever conditional probability is larger… and from this, our intuitions about basic logic would say it doesn’t make sense to assign probabilities in such a way that (no Fifth Column) is less likely to cause no sabotage than (Fifth Column), and this is what creates the effect you are noting that some event A and its negation ~A shouldn’t both be evidence for the same thing.
Compactly, it’s only fair to claim that A and ~A cannot both be evidence for B in some very special situations. In general, though, A and ~A definitely can both serve as supporting evidence for B, it’s just that they will corroborate B to different degrees and this may or may not be further offset by the prior distributions of A and ~A.
But it is important not to assert the incorrect generalization that “It is impossible for A and ~A to both be evidence for B.”
Did you take the other replies to Tom McCabe’s comment, which raise the same question you do but offer the opposite answer, into consideration? The appeal to intuition that a fifth column might be refraining from sabotage in order to create more effective sabotage later does not let you take both A and ~A as evidence for B. Any way you verbally justify it, you will still be dutch-bookable and incoherent.
Without losing the generality of the theorems of probability, let me address your particular narrative: If you believe that, if a fifth column exists, it is of the type that will assuredly refrain from sabotage now in order to prepare a more devastating strike later; then observing sabotage (or no sabotage) cannot alter your probability that a fifth column exists.
This is a fancy way of saying that if you assume that the fifth column’s intent is totally independent of the observance of sabotage. P(A | B ) = P(A). That is, no evidence can update your position along the lines of Bayes’ theorem.
This is not what I am saying. I am saying that P(A |B) and P(A | ~B) can both be nonzero, and in the Bayesian sense this is what is meant by evidence. Either observing sabotage or failing to observe sabotage can, strictly speaking, corroborate the belief that there is a secret Fifth Column. If you make the further assumption that the actions of the Fifth Column are independent from your observations about sabotage, then yes, everything you said is correct.
My only point is that, in general, you cannot say that it is a rule of probability that A and ~A cannot both be evidence for B. You must be talking about specific assumptions involving independence for that to hold.
It also makes sense to think orthogonally about A and ~A in the following sense: if these are my only two hypotheses, then if there is any best decision, it is because under some decision rule, either A or ~A maximizes the a posteriori probability, but not both. If the posterior was equi-probable (50/50) for the hypotheses, then observing or not observing sabotage would change nothing. This could happen if you make the independence assumption above, but even if you don’t, it could still happen that the priors and conditional probabilities just work out to that particular case, and there would be no optimal belief in the Bayesian sense.
For a concrete example, suppose I flip a coin and if it is Heads, I will eat a tuna sandwich with probability 3⁄4 and a chicken sandwich with probability 1⁄4, and if it is Tails I will eat a turkey sandwich with probability 3⁄4 and a chicken sandwich with probability 1⁄4. Now suppose you only get to see what sandwich I select and then must make your best guess about what the coin showed. If I select a chicken sandwich, then you would believe that either Heads or Tails could serve as evidence for this decision. Neither result would be surprising to you (i.e., neither result would change your model) if you learned of it after I selected a chicken sandwich.
In this case, both A and ~A can serve as evidence for chicken, to the tune of 1⁄4 in each case. A is much stronger evidence for tuna, ~A is much stronger evidence for turkey, but both, to some extent, are evidence of chicken.
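To make the sandwich numbers concrete, here is a minimal Python sketch. It assumes the coin is fair (the example doesn't say so explicitly), and the names `PRIOR`, `LIKELIHOOD`, `marginal`, and `posterior` are purely illustrative.

```python
from fractions import Fraction

# Assumption: the coin is fair.
PRIOR = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}

# P(sandwich | coin), taken from the example above.
LIKELIHOOD = {
    "Heads": {"tuna": Fraction(3, 4), "chicken": Fraction(1, 4), "turkey": Fraction(0)},
    "Tails": {"tuna": Fraction(0), "chicken": Fraction(1, 4), "turkey": Fraction(3, 4)},
}


def marginal(sandwich):
    """P(sandwich), summed over both coin faces."""
    return sum(PRIOR[c] * LIKELIHOOD[c][sandwich] for c in PRIOR)


def posterior(sandwich):
    """P(coin | sandwich) for each coin face, via Bayes' theorem."""
    total = marginal(sandwich)
    return {c: PRIOR[c] * LIKELIHOOD[c][sandwich] / total for c in PRIOR}


for s in ("tuna", "chicken", "turkey"):
    post = posterior(s)
    print(s, "P(sandwich) =", marginal(s), "| P(Heads | sandwich) =", post["Heads"])
```

Under these assumptions, seeing chicken leaves the coin at 1/2, and P(chicken | Heads) = P(chicken | Tails) = P(chicken) = 1/4: both outcomes are compatible with chicken, but neither makes chicken more probable than it already was.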
I’m not disagreeing with your claim about probability theory at all. I’m just saying that we don’t know that Warren made the assumption that his observations about sabotage were independent from the existence of a Fifth Column. For all we know, it was just that he had such a strong prior belief (which may or may not have been rational in itself) that there was a Fifth Column, that even after observing no sabotage, his decision rule was still in favor of belief in the Fifth Column.
It’s not that he mistakenly thought that the Fifth Column would definitely act in one way or the other. It’s just that both no sabotage and sabotage were, to some degree, compatible with his strong prior that there was a Fifth Column… enough so that after converting it to a posterior it didn’t cause him to change his position.
Uh..
A is evidence for B if P(B|A) > P(B). That is to say, learning A increases your belief in B. It is a fact from probability theory that P(B) = P(B|A)P(A) + P(B|¬A)P(¬A). If P(B|A) > P(B) and P(B|¬A) > P(B) then that says that:
P(B) > P(B)P(A) + P(B)P(¬A)
P(B) > P(B)(P(A) + P(¬A))
P(B) > P(B)
Since A and ¬A are exhaustive and exclusive (so P(A) + P(¬A) = 1) this is a contradiction.
On the other hand, P(B|A) and P(B|¬A) being nonzero just means both A and ¬A are consistent with B—that is, A and ¬A are not disproofs of B.
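If the algebra feels too quick, a brute-force check may help. The sketch below is only an illustration, not a proof: it draws random values for P(A), P(B|A) and P(B|¬A) as exact fractions, computes P(B) from the law of total probability, and counts how often both conditionals exceed P(B). The function names and the grid of hundredths are my own choices.

```python
import random
from fractions import Fraction


def both_raise_b(p_a, p_b_given_a, p_b_given_not_a):
    """True if learning A and learning not-A would each raise P(B)."""
    # Law of total probability.
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a > p_b and p_b_given_not_a > p_b


def count_counterexamples(trials=10_000, seed=0):
    """Search random exact-rational distributions for a counterexample."""
    rng = random.Random(seed)
    found = 0
    for _ in range(trials):
        p_a = Fraction(rng.randint(1, 99), 100)  # keeps 0 < P(A) < 1
        p_b_given_a = Fraction(rng.randint(0, 100), 100)
        p_b_given_not_a = Fraction(rng.randint(0, 100), 100)
        if both_raise_b(p_a, p_b_given_a, p_b_given_not_a):
            found += 1
    return found


print(count_counterexamples())
```

Because P(B) is a weighted average of the two conditionals, it can never sit below both of them, so the search should come back empty however many trials you run.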
Your definitions do not match mine, which come from here:
The evidence for the hypothesis M is Pr(D | M), regardless of whether or not Pr(D) > Pr(D | M), at least according to that page and this statistics book sitting here at my desk (pages 184-186), and perhaps other sources.
If it’s just a war over definitions, then it’s not worth arguing. My point is that it’s misleading to act like that attribute you call ‘consistency’ doesn’t play a role in what could fuel reasoning like Warren’s above. It’s not about independence assumptions or mistakes about what can be evidence (do you really think Warren cared about the technical, Bayesian definition of evidence in his thinking?). It’s about understanding a person’s formation of prior probabilities in addition to the method by which they convert them to posteriors.
Ah!
You’ve used “evidence” to refer to the probability P(D | M). We’re talking about the colloquial use of “evidence for the hypothesis” meaning an observation that increases the probability of the hypothesis. This is the sense in which we’ve been using “evidence” in the OP.
If you draw 5 balls from an urn, and they’re all red, that’s evidence for the hypothesis that the next ball will be red, and so you conclude that the next one could be red, with a bit more certainty than you had before. If you draw 5 balls from an urn, and they’re blue, that’s evidence against the hypothesis that the next one will be red, so you conclude that the next one is less likely to be red than you thought before.
By the Bayesian proof, however, your thought processes are wrong if every sequence of 5 balls leads you to increase your belief that the next one will be red.
This is essentially what Warren did. If he observed sabotage he would have increased his belief in the existence of a fifth column, and yet, observing no sabotage he also increased his belief in the existence of a fifth column. Clearly, somewhere he’s made a mistake.
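Here is one way to put numbers on the urn example, as a rough sketch. The assumptions are mine, not part of the example: the urn holds only red and blue balls, draws are with replacement (or the urn is very large), and we start with a uniform prior over the urn's red fraction. Under those assumptions only the count of reds matters, each count from 0 to 5 is equally likely beforehand, and Laplace's rule of succession gives the updated belief.

```python
from fractions import Fraction

N_DRAWS = 5
PRIOR_NEXT_RED = Fraction(1, 2)  # follows from the assumed uniform prior


def prob_k_red(k, n=N_DRAWS):
    """P(k reds in n draws) under a uniform prior on the red fraction."""
    # Integrating C(n, k) p^k (1-p)^(n-k) over [0, 1] gives 1/(n+1) for every k.
    return Fraction(1, n + 1)


def prob_next_red(k, n=N_DRAWS):
    """Laplace's rule of succession: P(next red | k reds in n draws)."""
    return Fraction(k + 1, n + 2)


# Average the updated belief over every outcome we might have seen.
expected_posterior = sum(
    prob_k_red(k) * prob_next_red(k) for k in range(N_DRAWS + 1)
)

for k in range(N_DRAWS + 1):
    direction = "raises" if prob_next_red(k) > PRIOR_NEXT_RED else "lowers"
    print(f"{k} reds: P(next red) = {prob_next_red(k)} ({direction} belief)")
print("expected posterior:", expected_posterior)
```

The updated beliefs average back to the prior of 1/2, so the outcomes that raise the belief (3, 4 or 5 reds here) have to be paid for by outcomes that lower it. A reasoner for whom every outcome raises the belief has broken that accounting somewhere.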
I see your point and I think we mostly agree about everything. My only slight extra point is to suggest that perhaps Warren was trying to use his prior beliefs to predict an explanation for absence of sabotage, rather than trying to use absence of sabotage to intensify his prior beliefs. In retrospect, it’s likely that you’re right about Warren and the quote makes it seem that he did, in fact, think that absence of sabotage increased likelihood of Fifth Column. But in general, though, I think a lot of people make a mistake that has more to do with starting out with an unreasonable prior, or making assumptions that their prior belief is independent of observations, than it has to do with a logical fallacy about letting conditioning on both A and ~A increase the probability of B.
The reply that it “is impossible for A and ~A to both be evidence for B” is to ignore what Frank said in favor of insisting on the very overgeneralization I think he was trying to point out. It’s not impossible at all when we are being imprecise enough about the prior expectations involved, such as when we lump all moments in a sustained effort together.
Here’s an example to illustrate what I’m saying: Say you are a parent of a 10-year-old boy who generally wants to stay up past his bedtime. His protests vary from occasional temper tantrums to the usual slumped-shoulders expression of disappointment that bedtime has finally arrived. Under normal circumstances, the expectation is that he would give at least some evidence of wanting to stay up later. We’ll call this resistance “A,” and A is evidence for “B”: his desire and motivation to stay up later. What shall we say when ~A happens? That is, what shall we say when the boy one day suddenly goes enthusiastically to bed? That he has given up his desire? That it is impossible for this to be evidence of his continued desire and motivation? Of course not. It is exactly what we would expect a motivated and reasonably intelligent person to do: try different and probably more effective strategies. If we generalize the ongoing experience of the little boy’s quest to stay up later, A and ~A are both evidence of B. “It is impossible for A and ~A to both be evidence for B” is simply not narrow enough to be a true statement, and using it in that way can easily amount to a bad counterargument.
Rather, we need to be specific about each situation. What I think we should pay attention to here is the prior expectation of B. With a high enough prior, either A or ~A (but not both) could be evidence of B. But if we are not being specific to each precise situation, the generalization “it is impossible for A and ~A to both be evidence of B” can be a very subtle straw man, because the person being argued against may not be relying on the assumption that A and ~A are equal evidence for B at the same time and in the same situation.
Returning to the Japanese Fifth Column argument, unlike the little boy in my example the Japanese (and, in general, descendants from countries that go to war with their current country of citizenship) do not have a consistent track record of wartime sabotage. Also, there isn’t any reason to think they would not generally be more loyal to their country of citizenship than the country of their parents, grandparents, or even of their own childhood. So there is no particularly strong expectation that they would commit sabotage… and thus no such expectation that some mysterious lack of sabotage is itself a sign of a new strategic attempt as part of a sustained effort. The argument should come down to the prior expectation of Japanese sabotage. That seems to be the crux of it to me.
It seems to me the weakness in Frank’s argument also lies in the basic premise that we should expect the Japanese to commit sabotage. And I believe the governor would need to rely on that premise, or a similar one, in order to sustain his argument beyond what Eliezer presented.
But. The inside information premise seems nearly undefeatable to me. We can’t comment on information we don’t have. I think that is always a possibility with controversial official responses that most people would prefer to deny. If the person whose claims you are evaluating has secret but pertinent information you don’t have access to, then it can be very difficult to offer a fair analysis. For one, you will have very different yet subjectively valid prior expectations.
You seem to be using “evidence of X” to mean something along the lines of “consistent with X”. That’s not what it means in this context.
An event is evidence for or against a scenario insofar as it changes your subjective probability estimate for that scenario. Your example child going enthusiastically to bed is in fact evidence that he’s changed his mind about staying up past his bedtime: it makes that scenario subjectively more plausible, even though it’s still probably a long-shot option given what you know. It might simultaneously be evidence for some new bedtime-avoidance scheme, but that’s entirely consistent with it also pointing to a possible change of heart: the increased probability of both is made up for in the reduced probability of him continuing with his old behavior.
Subjective probabilities for either/or scenarios have to sum to unity, and so evidence for one such option has to be balanced out by evidence against one or more of the others. A and ~A cannot both be evidence for a given scenario; at best they can both leave it unaffected.
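A small Python sketch of the bedtime case may make the bookkeeping visible. All the numbers are made up for illustration, and the three hypothesis labels are my own shorthand for the scenarios described above: the boy keeps resisting openly, he has switched to a new scheme, or he has had a change of heart.

```python
from fractions import Fraction

# Hypothetical priors over three mutually exclusive scenarios.
PRIOR = {
    "resisting_openly": Fraction(90, 100),
    "new_scheme": Fraction(8, 100),
    "change_of_heart": Fraction(2, 100),
}

# Hypothetical P(goes to bed enthusiastically | scenario).
LIKELIHOOD = {
    "resisting_openly": Fraction(5, 100),
    "new_scheme": Fraction(80, 100),
    "change_of_heart": Fraction(90, 100),
}


def update(prior, likelihood):
    """Bayes' theorem over a finite set of exclusive, exhaustive hypotheses."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


POSTERIOR = update(PRIOR, LIKELIHOOD)

for h in PRIOR:
    print(f"{h}: {float(PRIOR[h]):.3f} -> {float(POSTERIOR[h]):.3f}")

# "He still wants to stay up" covers the first two scenarios.
desire_before = 1 - PRIOR["change_of_heart"]
desire_after = 1 - POSTERIOR["change_of_heart"]
print(f"still wants to stay up: {float(desire_before):.3f} -> {float(desire_after):.3f}")
```

With these made-up numbers, enthusiastic bedtime raises both the new-scheme and the change-of-heart scenarios, and the gain comes out of the open-resistance scenario. The overall belief that he still wants to stay up goes down, though it remains the most likely explanation.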
I think I understand that a little better now. So thank you for taking the time to explain that to me.
Even so, it seems all I must do is add to my counterexample a prior track record of the little boy changing strategies while pretending to go along with authority. Reconsidering my little boy example with that in mind, does that change your reply?
Also, I fail to see how your response ameliorates my objection to the claim “it is impossible for A and ~A to both be evidence for B.” By your own explanation, they are both evidence, albeit offering unequal relative probabilities (forgive me if I’m getting the password wrong there, but I think you can surmise what it is I’m getting at). Maybe if we say that “It is impossible for A and ~A to both offer the same relative probability for B at the same time and concerning the same situation and given the same subjective view of the facts, etc,” we have something that doesn’t lead us to claim things that are not true about someone else’s argument, as in the case above, that their argument depends on A and ~A at the same time and in the same way, when the precise claim in question is actually that A can be evidence for B in one situation; and based upon the expectation set upon the observance of subsequent facts, at some later date, ~A could also end up being evidence for B. I’m not sure if I’ve explained that clearly, but I’ll keep trying until either I get what I’m missing, or I manage to express clearly what may well be coming out as gibberish. Either way, I get a little slice of the self-improvement I’m looking for.
Thanks again, and I hope you can forgive my wet ears on this and bear with me. The benefits of our exchanges here will probably be pretty one sided; I have almost nothing to offer a more experienced rationalist here, and lots to gain… and I realize that, so bear with me, and please know I am grateful for the feedback.
Here’s a contradiction with A and ~A both being evidence for the same thing. You could tell your spouse “Go up and check if little Timmy went to bed”. Before ze comes back you already have an estimate of how likely Timmy is to go to bed on time (your prior belief). But then your spouse, who was too tired to climb the stairs, comes back and tells you “Little Timmy may or may not have gone to bed”. Now, if both of those possibilities would be evidence of Timmy’s staying up late then you should update your belief accordingly. But how can you do that without receiving any new information?
Yes. I get that. We cannot use A and ~A to update our estimates in the same way at the same time. That’s not the same as saying that it is impossible for A and ~A to be evidence of the same thing. One could work on Tuesday, and the other could work on Friday, depending on the situation. That was my only point: we can’t generalize across a timeline; we need to reason about specific points on that timeline. That goes back to the justification for interning Japanese citizens. If we say ~A just can’t ever be evidence of B because at some previous time A was evidence for B, then we are making a mistake. At some later date, ~A could end up being better evidence, depending on the situation. My point was that a better counterargument to the governor’s justification is to point out that the prospect of naturalized citizens turning against their home country in favor of their country of ancestry presents a very low prior, because the Japanese (and other groups that polyglot nations have gone to war with) have not usually behaved that way in the past. I could be wrong, but it doesn’t have anything to do with updating estimates with a variable and its negation to reach the same probability at the same time. I pretty much agree with what you said, just not the implication that it conflicts in some way with what I said.
I think the brief answer to this point is that it is very important to define the hypothesis precisely, to avoid being confused in the way you describe.
Applying that lesson to Earl Warren, we can say that he failed to distinguish between motive for big-sabotage and motive for little-sabotage. Lack of little-sabotage events is evidence in favor of motive-for-big-sabotage (and for no-motive-to-sabotage: with the benefit of hindsight, we know this was the true state of the world). But unclear phrasing by Warren made it sound like he believed absence of little-sabotage was evidence of motive-to-little-sabotage, which is a nonsensical position.
Given the low probability of big-sabotage (even after incorporating the evidence Warren puts forth), it’s pretty clear that the argument for Warren’s suggested policy (Japanese-American internment) depended pretty heavily on the confused thinking created by this equivocation.