Since I know those people, I would weight their answers according to my best estimate of their skill at such tasks, and then average the whole group, including me.
Doing this correctly can get pretty complicated. Basically, the more people you have, the less you should weight the low-quality estimates compared to the high-quality estimates.
For example, suppose that “good” thermometers are unbiased and “bad” thermometers are all biased in the same direction, but you don’t know which direction.
If you have one thermometer which you know is good, and one which you’re 95% sure is good, then you should weight both measurements about the same.
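But if you have 10^6 thermometers which you know are good, and 10^6 which you’re 95% sure are good, then you should pretty much ignore the possibly-bad ones. Averaging more readings shrinks the random noise in both groups, but it does nothing to a bias that the bad thermometers all share.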
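Here is a rough sketch of the thermometer example in numbers. It is not a full Bayesian treatment. It rests on assumptions that are mine, not part of the example: every reading has independent noise with standard deviation `sigma`; the uncertain group is either all good or all bad, with probability `p_bad` of being bad; and if bad, every thermometer in it is off by the same amount `bias`, with the sign equally likely either way. Under those assumptions, the best linear combination of the two group means weights each by the inverse of its mean squared error. The function name and the default values are made up for illustration.

```python
def weight_on_uncertain_group(n, sigma=1.0, p_bad=0.05, bias=1.0):
    """Minimum-MSE linear weight on the mean of the possibly-bad group.

    Assumes n thermometers per group, independent noise of std `sigma`,
    and a shared bias of +/- `bias` that is present with probability `p_bad`.
    """
    # Error of the known-good group's mean: pure noise, shrinks as 1/n.
    mse_known = sigma**2 / n
    # Error of the uncertain group's mean: the same noise term, plus the
    # expected squared shared bias, which does not shrink with n.
    mse_uncertain = sigma**2 / n + p_bad * bias**2
    # Inverse-MSE weighting, with the two weights constrained to sum to 1.
    return mse_known / (mse_known + mse_uncertain)


for n in (1, 10**6):
    print(n, weight_on_uncertain_group(n))
```

With these made-up numbers, the possibly-bad thermometer gets a weight of about 0.49 when there is one of each, and the possibly-bad group gets a weight of about 0.00002 when there are 10^6 of each.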
Not that it matters tremendously, but I was thinking of the jelly bean problem.
What kind of weighted average?
My math isn’t good enough to formalize it—I’d do it by feel.
Drat—likewise.