So the better a woman does, the less you believe she can actually do it.
Yes; “extraordinary claims require extraordinary evidence, but ordinary claims require only ordinary evidence.” If a random person tells me that they are a Rhodes Scholar and a certified genius, I will be more skeptical than if they told me they merely went to Harvard, and more skeptical of that than if they told me they went to community college. And at some level of ‘better’ I will stop believing them entirely.
At what point do you update your prior about what women can do?
To go back to the multilevel model framework: a single high data point/group will be pulled back down to the mean of the population data points/group (how much will depend on the quality of the test), while the combined mean will slightly increase.
However, this increase may be extremely small, which makes sense. If you know from the official SAT statistics that 3 million women took the SAT last year and scored an average of 1200 (or whatever an average score looks like these days, they keep changing the test), then that’s an extremely informative number which will be hard to change, since you already know how millions of women have done in the past: so whatever you learn from a single random woman scoring 800 this year will be diluted like 1 in 3 million...
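To make the two effects concrete, here is a rough sketch under a simple normal-normal model, which is my simplification and not a full multilevel model. The 1200 average and the 3 million test-takers are the figures from above; the population SD of 200, the measurement-error SD of 100, and the high scorer at 1600 are made-up numbers purely for illustration.

```python
def shrink(observed, pop_mean, pop_sd, noise_sd):
    """Posterior mean of one person's true score under a normal-normal model.

    The noisier the test (larger noise_sd), the harder the observed
    score is pulled back toward the population mean.
    """
    prior_var = pop_sd ** 2
    noise_var = noise_sd ** 2
    weight = prior_var / (prior_var + noise_var)  # share of trust given to the observation
    return pop_mean + weight * (observed - pop_mean)


def updated_mean(old_mean, n_old, new_score):
    """Population mean after pooling one more score with n_old earlier ones."""
    return (old_mean * n_old + new_score) / (n_old + 1)


# Assumed numbers: SDs and the 1600 score are hypothetical.
POP_MEAN = 1200        # average from the example above
N_PREVIOUS = 3_000_000 # test-takers from the example above
POP_SD = 200           # assumption
NOISE_SD = 100         # assumption: test measurement error
HIGH_SCORE = 1600      # assumption: one unusually high scorer

estimate = shrink(HIGH_SCORE, POP_MEAN, POP_SD, NOISE_SD)
new_mean = updated_mean(POP_MEAN, N_PREVIOUS, HIGH_SCORE)

print(f"shrunken estimate for the individual: {estimate:.1f}")
print(f"population mean after one more score: {new_mean:.6f}")
```

With these assumed numbers the individual estimate is pulled a fifth of the way back toward 1200, while the population mean moves by a bit over a ten-thousandth of a point: the single data point is shrunk noticeably, and the prior about the group barely moves.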
Nifty: I’ve found an explanation of Stein’s paradox, and it turns out to be basically shrinkage!
Ahh… “Expect regression to the mean”.
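For the Stein's paradox connection, a minimal sketch of the positive-part James-Stein estimator shrinking toward zero, assuming independent normal observations with a known, common noise variance. The function name and the example vector are mine, chosen only for illustration.

```python
def james_stein(x, noise_var=1.0):
    """Positive-part James-Stein estimate of a vector of means.

    Assumes each x[i] is one normal observation of its own mean with
    known variance noise_var. All coordinates are shrunk toward zero
    by a common factor; requires at least 3 coordinates.
    """
    p = len(x)
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 coordinates")
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0:
        return list(x)  # nothing to shrink
    # The "positive part" clamps the factor at 0 so estimates never flip sign.
    factor = max(0.0, 1.0 - (p - 2) * noise_var / norm_sq)
    return [factor * v for v in x]


# Hypothetical observations: squared norm is 25, so the factor is 1 - 1/25 = 0.96.
print(james_stein([3.0, 4.0, 0.0]))
```

Every coordinate is pulled toward the common target by the same factor, which is the same "regress extreme values toward the mean" move as above, just applied to several unrelated quantities at once.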