The standard one goes something like, “The dangerous disease itchyballitis has a frequency of 1% in the general population of men. The test for the disease is 95% accurate (a 5% false positive rate and a 5% false negative rate). A randomly selected dude gets tested and the result is positive. What’s the probability he has the disease?”
But most people get that wrong. A correct answer is more likely when the problem is phrased in equivalent but more concrete terms as follows: “The dangerous disease itchyballitis affects 100 out of 10,000 men. The test for the disease gives the correct answer 95 times out of 100. A randomly selected dude gets tested and the result is positive. What’s the chance he has the disease?”
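To see why the concrete phrasing helps, here is a minimal sketch that just counts heads in the 10,000-man version. The variable names are mine, and it assumes "correct 95 times out of 100" applies equally to sick and healthy men.

```python
# Natural-frequency version: count people instead of multiplying probabilities.
population = 10_000
sick = 100                     # 100 out of 10,000 men have the disease
healthy = population - sick    # the other 9,900

# Assumption: the test is right 95 times out of 100 for sick and healthy alike.
true_positives = sick * 95 // 100      # sick men who test positive
false_positives = healthy * 5 // 100   # healthy men who test positive

all_positives = true_positives + false_positives
chance_sick = true_positives / all_positives

print(true_positives, false_positives, all_positives)
print(round(chance_sick, 3))
```

Out of 590 positive tests only 95 belong to men who are actually sick, which is roughly 16%.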
Or, for the approximate answer, just compare the base rate with the false positive rate (the extra factors of 0.95 and 0.99 are both close to 1 and mostly cancel out). About 1% of people test positive due to having the disease (a bit less, actually), about 5% of people test positive because of an inaccurate test (a bit less, actually), so the odds are about 1:5 and a person with a positive test has about a 1 in 6 chance of having the disease.
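A quick check of how far off the shortcut is, as a rough sketch (variable names are just for illustration):

```python
base_rate = 0.01    # fraction of men with the disease
error_rate = 0.05   # test is wrong 5% of the time, either way

# Rough version: ignore the 0.95 and 0.99 factors entirely.
rough = base_rate / (base_rate + error_rate)

# Exact version: keep them.
true_pos = base_rate * (1 - error_rate)     # 0.01 * 0.95
false_pos = (1 - base_rate) * error_rate    # 0.99 * 0.05
exact = true_pos / (true_pos + false_pos)

print(round(rough, 3), round(exact, 3))
```

The shortcut gives about 0.167 against an exact value of about 0.161, so it is off by well under one percentage point.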
It’s more intuitive to use odds. Prior odds are 1:99, likelihood ratio (strength of evidence) given by a positive test is 95:5, so posterior odds are (1:99)*(95:5)=19:99, or probability of 19⁄118 (about 16%).
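The same odds calculation can be done with exact fractions. This is only a sketch, using Python's standard `fractions` module to represent a:b odds as the fraction a/b:

```python
from fractions import Fraction

prior_odds = Fraction(1, 99)         # 1:99 in favour of having the disease
likelihood_ratio = Fraction(95, 5)   # a positive test is 19 times likelier if sick

posterior_odds = prior_odds * likelihood_ratio        # 19:99
probability = posterior_odds / (1 + posterior_odds)   # odds a:b -> a / (a + b)

print(posterior_odds, probability)
```

Converting odds a:b back to a probability is a/(a+b), which is where the 19/118 comes from.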
p(test_positive|itchyballitis) = 0.95
p(test_positive|!itchyballitis) = 0.05
p(itchyballitis) = 0.01
p(test_positive) = p(test_positive|itchyballitis) * p(itchyballitis) + p(test_positive|!itchyballitis) * p(!itchyballitis)
= 0.95 * 0.01 + 0.05 * 0.99
= 0.059
p(itchyballitis|test_positive) = (p(test_positive|itchyballitis) * p(itchyballitis)) / p(test_positive)
= (0.95 * 0.01) / 0.059
= 0.161
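For anyone who wants to plug in other numbers, the derivation above can be wrapped in a small function. This is a sketch, and the function and parameter names are made up for illustration:

```python
def posterior(prior, p_pos_given_sick, p_pos_given_healthy):
    """Return p(sick | positive test) via Bayes' theorem."""
    # Total probability of a positive test, over sick and healthy people.
    p_pos = p_pos_given_sick * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_sick * prior / p_pos

# The numbers from the problem.
print(round(posterior(0.01, 0.95, 0.05), 3))

# Same test, but a disease ten times rarer: a positive result means even less.
print(round(posterior(0.001, 0.95, 0.05), 3))
```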
Edit: If anyone else is thinking of writing math stuff in a comment, don’t do what I did. Read http://wiki.lesswrong.com/wiki/Comment_formatting first! Also, thanks Vladimir_Nesov.
Use backslash before stars \* to have them in the comment * without turning text into italics.