They were given the information that the test has a sensitivity of 90% (10% false negative rate), a specificity of 91% (9% false positive rate), and that the base rate of cancer for the patient’s age and sex is 1%. Famously, nearly half of doctors incorrectly answered that the patient had a 90% probability of having cancer. [1] The actual probability is only 9%.
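For reference, the 9% figure follows from Bayes' rule applied to the three numbers quoted above. The following is a minimal sketch, assuming the 1% base rate really is the right prior (that is, a routine screening); the function and parameter names are illustrative only.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule."""
    # Fraction of all patients who are sick AND test positive.
    true_pos = sensitivity * prior
    # Fraction of all patients who are healthy AND test positive.
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Numbers from the quoted problem: 1% base rate, 90% sensitivity,
# 9% false positive rate (91% specificity).
p = posterior_given_positive(prior=0.01, sensitivity=0.90, false_positive_rate=0.09)
print(f"{p:.1%}")  # prints 9.2%
```

The result is small because false positives from the 99% of healthy patients outnumber true positives from the 1% who are sick by roughly ten to one.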
The probability surely isn’t 90%, but if the scenario presented to the doctors was anything other than “routine cancer screening that we do for everybody who comes in here”, the probability isn’t 9% either.
Most people are tested for cancer because they have one or more symptoms consistent with cancer. So the base rate of 1% “for the patient’s age and sex” isn’t the correct prior, because most of the people in the base rate have no symptoms that would provoke a test. The correct prior would be adjusted for the patient’s symptoms. But how do we actually adjust the prior for symptoms? I don’t know. It sounds difficult.
But I expect that a test has usually been used in the past in exactly this way: restricted to those with symptoms. So as long as someone is in charge of gathering data on this, we should already have an empirical estimate of P(has cancer | positive cancer test result + symptoms), e.g. 30%, that doctors can use directly. This information should be available because, as time passes after a test, it should eventually become clear whether the patient really had cancer or not, and that later information (aggregated over numerous patients) gives us a pretty good estimate of P(has cancer | positive + symptoms) and P(has cancer | negative + symptoms) for new patients. (But I am not a doctor and can’t vouch for whether The System is bothering to calculate these things.)
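As a rough sketch of what such an empirical estimate could look like: given follow-up records for symptomatic patients, each recording the test result and the eventual diagnosis, the two probabilities are just ratios of counts. The record format and the numbers below are made up for illustration (chosen so the positive case lands on the 30% example above); they are not real data.

```python
from collections import Counter


def empirical_posteriors(records):
    """Estimate P(disease | test result) from follow-up data.

    records: iterable of (test_positive, had_disease) boolean pairs,
    all drawn from symptomatic patients (a hypothetical format).
    Returns {True: P(disease | positive), False: P(disease | negative)}.
    """
    counts = Counter(records)
    estimates = {}
    for test_positive in (True, False):
        diseased = counts[(test_positive, True)]
        total = diseased + counts[(test_positive, False)]
        # No estimate is possible if nobody had this test result.
        estimates[test_positive] = diseased / total if total else None
    return estimates


# Made-up follow-up data for 300 symptomatic patients.
records = (
    [(True, True)] * 30      # positive test, cancer later confirmed
    + [(True, False)] * 70   # positive test, no cancer
    + [(False, True)] * 5    # negative test, cancer later confirmed
    + [(False, False)] * 195 # negative test, no cancer
)
est = empirical_posteriors(records)
print(est[True], est[False])
```

With these invented counts the estimates come out to 0.3 after a positive test and 0.025 after a negative one. In practice the estimates would only be as good as the follow-up data behind them.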
It’s good to see, then, that there are separate measures of sensitivity and specificity for symptomatic and asymptomatic patients.
This post doesn’t tell me what I want to know, though, which is:
P(I am infected | positive test & symptoms)
P(I am infected | positive test & no symptoms)
P(I am infected | negative test & symptoms)
P(I am infected | negative test & no symptoms)
So, if you got a negative result, you can lower your estimated odds that you have COVID to 0.4x what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 145x what they were before.
My prior would just be a guess, and I don’t see how multiplying a guess by 145x is helpful. We really need a computation whose result is a probability.
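For what it's worth, the odds multiplication can be turned back into a probability in a couple of lines. The sketch below uses the 145x and 0.4x factors quoted above and tries a few prior guesses; the priors themselves are arbitrary placeholders, not estimates of anyone's actual risk.

```python
def update_probability(prior_prob, bayes_factor):
    """Apply a Bayes factor to a prior probability and return a probability."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# Bayes factors quoted from the post; priors are arbitrary guesses.
POSITIVE_FACTOR = 145
NEGATIVE_FACTOR = 0.4

for prior in (0.001, 0.01, 0.10):
    after_pos = update_probability(prior, POSITIVE_FACTOR)
    after_neg = update_probability(prior, NEGATIVE_FACTOR)
    print(f"prior {prior:.3f}: positive -> {after_pos:.3f}, negative -> {after_neg:.4f}")
```

The output is a probability, but it is still only as trustworthy as the prior that went in, which is the objection being raised here.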
Most people are tested for cancer because they have one or more symptoms consistent with cancer. So the base rate of 1% “for the patient’s age and sex” isn’t the correct prior, because most of the people in the base rate have no symptoms that would provoke a test.
To clarify, the problem that Gigerenzer posed to doctors began with “A 50-year-old woman, no symptoms, participates in a routine mammography screening”. You’re right that if there were symptoms or other reasons to suspect having cancer, that should be factored into the prior. (And routine mammograms are in fact recommended to all women of a certain age in the US.)
We really need a computation whose result is a probability.
I agree: it would be ideal to have a way to precisely calculate your prior odds of having COVID. I try to estimate this using microCOVID to sum my risk based on my recent exposure level, the prevalence in my area, and my vaccination status. I don’t know a good way to estimate my prior if I do have symptoms.
My prior would just be a guess, and I don’t see how multiplying a guess by 145x is helpful.
I don’t fully agree with this part, because regardless of whether my prior is a guess or not, I still need to make real-world decisions about when to self-isolate and when to seek medical treatment. If I have a very mild sore throat that might just be allergies, and I stayed home all week, and I test negative on a rapid test, what should I do? What if I test negative on a PCR test three days later? Whether I’m using Bayes factors, test sensitivity, or just my intuition, I’m still using something to decide when it’s safe to go out again. Knowing the Bayes factors for the tests I’ve taken helps ground that reasoning a little more firmly in reality.
Edit: I’ve updated my post to make it clearer that the Gigerenzer problem specified that the test was a routine test on an asymptomatic patient.