Can you please link what you’re quoting from.
Here.
Thanks.

ETA: I followed both the link and the links to several of Wikipedia’s sources, but no further. The stuff I saw all seems to support Rolf’s claims about S. L. A. Marshall being unreliable and being the primary source for most of the claims on the killing-is-hard side.
Isegoria claims that Grossman’s claims, if not Marshall’s, are better supported by things like the fighter-pilot studies: http://westhunt.wordpress.com/2014/12/28/shoot-to-kill/#comment-64665
Fighter pilot victories in clear-air combat are rare; it follows that they are Poisson-distributed, and that you would expect to have a few extreme outliers and a great mass of apparent “non-killers” even if every pilot was doing his genuine best to kill. That is even before taking into account pilot skill, which for all we know has a very wide range.
I don’t see how that follows at all. You don’t know it was a Poisson distribution (there are lots of distributions natural phenomena follow; the negative binomial and lognormal also pop up a lot in human contexts), and even if you did, you don’t know the relevant rate parameter lambda, so you can’t know how many pilots should be expected to have at least 1 success; and since you’re making purely a priori arguments here rather than observing that the studies have specific flaws (e.g. perhaps they included pilots who never saw combat), it’s clear you’re trying to make a fully general counterargument to explain away any result those studies could have reached, without knowing anything about them. (‘Oh, only .001% of pilots killed anyone? That darn Poisson!’)
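To make the rate-parameter point concrete (every number below is invented for illustration; none comes from any pilot study): two count distributions with the very same mean imply very different fractions of zero-kill pilots, so a bare “most pilots never killed anyone” figure is uninterpretable without committing to both a distribution and a rate.

```python
import math

# Hypothetical average kills per pilot -- a made-up number, chosen only
# to show how much the zero-kill fraction depends on the distribution.
mean_kills = 0.5

# Poisson: P(0 kills) = exp(-lambda).
p0_poisson = math.exp(-mean_kills)

# Negative binomial (i.e. Poisson rates mixed over a gamma, modeling
# pilot heterogeneity): with mean mu and dispersion r,
# P(0) = (r / (r + mu))**r.  Small r means strong heterogeneity.
r = 0.2
p0_negbin = (r / (r + mean_kills)) ** r

print(f"P(zero kills), Poisson with mean {mean_kills}:      {p0_poisson:.3f}")
print(f"P(zero kills), negative binomial, same mean: {p0_negbin:.3f}")
```

Same mean, yet the heterogeneous model predicts substantially more pilots with zero kills than the Poisson does; without knowing which model (and which lambda) applies, the raw fraction tells you nothing about whether pilots were trying.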
The Poisson distribution is the distribution that models rare independent events. Given how involved you are with prediction and statistics, I’d expect you to know that.
Are fighter-pilot victories clearly, a priori, going to be independent events? Is a pilot shooting down one plane entirely independent of whether they go on to shoot down another? (Think about the other two distributions I mentioned and why they might be better matches...)
Distributions are model assumptions, to be checked like any other. In fact, often they are the most important and questionable assumption made in a model, and the one which determines the conclusion; a LW example of this is Karnofsky’s statistical argument against funding existential risk, which is driven entirely by the chosen distribution. As the quote goes: ‘they strain at the gnat of the prior who swallow the camel of the likelihood function’.
I personally find choice of distribution to be dangerous, which is why (when it’s not too much more work) in my own analyses I try to use nonparametric methods: Mann-Whitney U-tests rather than t-tests, bootstraps, and at least looking at graphs of histograms or residuals while I’m doing my main analysis. Distributions are not always as one expects.

To give an example involving the Poisson: I was doing a little Hacker News voting experiment. One might think that a Poisson would be a perfect fit for the distribution of scores—lots of voters, each one voting on only a few links out of the thousands submitted each day, different voters, and votes are positive count data. One would be wrong: while a Poisson fits better than, say, a normal, it’s grossly wrong about the outliers; what actually fits much better is a mixture of at least 3 sub-distributions of Poissons and possibly normals or others. (My best guess is that this mixture is caused by HN’s segmented site design leading to odd voting dynamics: the first distribution corresponds to low-scoring submissions which spend all their time on /newest, and the rest to various subpopulations of submissions which make it to the main page—although I’m not sure why there is more than one of those.)
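A toy version of what I mean (the component weights and means here are invented for illustration, not fitted to the actual HN data): sample scores from a 3-part Poisson mixture, then compare the observed tail to what a single Poisson with the same overall mean would predict.

```python
import math
import random

random.seed(0)

def poisson_sample(lam: float) -> int:
    """Draw one Poisson variate by Knuth's multiplication method
    (fine for the modest lambdas used here)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

# Hypothetical subpopulations: /newest-only, front page, viral.
weights = [0.70, 0.25, 0.05]
lams    = [1.0, 10.0, 60.0]

scores = []
for _ in range(20_000):
    lam = random.choices(lams, weights=weights)[0]
    scores.append(poisson_sample(lam))

mean = sum(scores) / len(scores)
observed_tail = sum(s >= 30 for s in scores) / len(scores)

# Tail of a single Poisson(mean): accumulate P(X < 30) term by term.
p, cdf = math.exp(-mean), 0.0
for k in range(30):
    cdf += p
    p *= mean / (k + 1)
poisson_tail = 1 - cdf

print(f"mixture mean score:          {mean:.2f}")
print(f"observed P(score >= 30):     {observed_tail:.4f}")
print(f"single-Poisson P(score >= 30): {poisson_tail:.2e}")
```

The single Poisson matches the mean by construction, yet it puts essentially zero probability on the high scores the mixture produces routinely—the same sense in which a casually assumed Poisson is “grossly wrong about the outliers”.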
So no, I hope it is because of, rather than despite, my involvement with stats that I object to Rolf’s casual assumption of a particular distribution to create a fully general counterargument to explain away data he has not seen but dislikes.
Rolf addressed that point:
In particular notice that any deviations from Poisson are going to be in the direction that makes Rolf’s argument even stronger.
No, they’re not, not without even more baseless assumptions. The Poisson is not well-justified, and it’s not even conservative for Rolf’s argument. If there were a selection process in which the best pilots saw the most combat (a shocking proposition, I realize), then many more pilots would cross the threshold of at least 1 kill than would be predicted by incorrectly modeling kill counts as a single Poisson at the average rate. This is the sort of thing (multiple consecutive factors) which generates other possible distributions like the lognormal, which appear all the time in human-performance data such as scientific publications. (‘...who swallow the camel of the likelihood function’.)
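A sketch of the “multiple consecutive factors” mechanism (all factor ranges below are invented for illustration): when a pilot’s effective kill rate is the product of several independent factors—aircraft, training, eyesight, sortie count, and so on—the rates come out roughly lognormal, and a small top fraction ends up with a wildly disproportionate share, even though no single factor is extreme.

```python
import random

random.seed(1)

def simulated_rate(n_factors: int = 6) -> float:
    """Multiply several independent, individually unremarkable factors;
    the product is approximately lognormal (log of a product is a sum,
    and sums of independent terms tend toward normal)."""
    rate = 1.0
    for _ in range(n_factors):
        rate *= random.uniform(0.3, 1.7)  # each factor mean 1.0
    return rate

rates = [simulated_rate() for _ in range(50_000)]
mean_rate = sum(rates) / len(rates)

# Share of total "kills" attributable to the top 1% of pilots.
top1 = sorted(rates)[int(0.99 * len(rates)):]
share_of_top1 = sum(top1) / sum(rates)

print(f"mean rate:            {mean_rate:.2f}")
print(f"share held by top 1%: {share_of_top1:.1%}")
```

A pure Poisson world would give the top 1% roughly 1% of the kills; the multiplicative world concentrates a far larger share in a few “aces” while leaving a thick mass of near-zero performers, which is exactly why the choice of distribution cannot be waved through.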
And this still doesn’t address my point that you cannot write off data you have not seen with a fully general counterargument, not without very good reasons, which Rolf has not come anywhere close to providing. You do not know whether that extremely low quoted rate is exactly what one would expect from pilots doing their level best to kill without doing a lot more work to verify that a Poisson fits, what the rate parameter is, and what the distribution of pilot differences looks like; the final kill rate of pilots, just like that of soldiers, is the joint result of many things.