I really have no idea what went so wrong [with the question about Bayes’ birth year]
Note also that in the last two surveys the mean and median answers were approximately correct, whereas this time even the first quartile answer was too late by almost a decade. So it’s not just a matter of overconfidence: there was also a systematic error. Note that An Essay Towards Solving a Problem in the Doctrine of Chances was published posthumously when Bayes would have been 62; if people estimated the year it was published and assumed that he had been approximately in his thirties (as I did), that would explain half of the systematic bias.
To expand on this: Confidence intervals that are accurate for multiple judgements by the same person may not be accurate for the same judgement made by multiple people. Normally, we can group everyone’s responses and measure how many people were actually right when they said they were 70% sure. This should average out to 70% because the error is caused by independent variations in each person’s estimate. If there’s a systematic error, then even if we all allowed for it in our confidence levels, we would all still fail at the same time whenever that error actually occurred.
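A rough way to see the difference is a toy simulation. This is only a sketch under assumptions that are mine, not the survey's: errors are normally distributed, everyone states a correctly sized 70% interval, and the names `hit_rate`, `shared_sd` and `personal_sd` are made up for illustration. With purely independent errors, about 70% of people land inside their interval on a single question. With a purely shared error, each question is all-or-nothing: everyone hits or everyone misses together.

```python
import random
from statistics import NormalDist

# z such that a normal variable falls within +/- z standard deviations 70% of the time
Z70 = NormalDist().inv_cdf(0.85)

def hit_rate(n_people, shared_sd, personal_sd, rng):
    """Fraction of people whose 70% interval contains the truth on ONE question.

    Each guess = shared error (same for everyone) + personal error (independent).
    Everyone sizes their interval correctly for the total spread (assumption).
    """
    total_sd = (shared_sd ** 2 + personal_sd ** 2) ** 0.5
    half_width = Z70 * total_sd
    shared_error = rng.gauss(0, shared_sd)  # drawn once per question
    hits = sum(
        abs(shared_error + rng.gauss(0, personal_sd)) <= half_width
        for _ in range(n_people)
    )
    return hits / n_people

rng = random.Random(0)

# Independent errors only: one question, many people
independent = hit_rate(10_000, shared_sd=0, personal_sd=30, rng=rng)

# Shared error only: twenty separate questions, 1000 people each
shared_runs = [hit_rate(1_000, shared_sd=30, personal_sd=0, rng=rng) for _ in range(20)]

print("independent errors, one question:", independent)
print("shared error, per-question hit rates:", shared_runs)
```

In the shared-error case the 70% only shows up when averaging over many different questions, which a single survey question cannot do.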
I had a vaguely right idea for the year of publication, and didn’t know it was posthumous, but assumed that it was published in his middle-to-old age and so got the question right.
This question was biased against people who don’t believe in history.
For my answer, which was wildly wrong, I guesstimated by interpolating backward using the rate of technological and cultural advance in various cultures throughout my lifetime and the dependency of such advances on Bayesian and related logics, with an adjustment for known wars and persecution of scientists and an assumption that Bayes lived in the western world. I should have realized that my confidence in each of these estimates (except the last) was not very good and that I really shouldn’t have had any more than marginal confidence in my answer, but I was hoping that with the sheer number of assumptions I made, the result would approach the statistical mean of my confidences and the overestimates would counterbalance the underestimates.
The real lesson I learned from this exercise is that I shouldn’t have such high confidence in my ability to produce and compound a statistically significant number of assumptions with associated confidence levels.
Have you read Malcolm Gladwell’s Blink? It’s a fun book that doesn’t take too long, which hella makes up for the occasional failure of rigor. Anyhow, the conclusion is that even on hard problems, expert-trusted models will still have very few parameters. And those parameters don’t have to be the same things you’d use if you were a perfect reasoner; what’s important is that you can use them as indicators.
I personally had error bars of 75 years on my confidence and was 74 years off. I’m not sure if I translated that correctly into percent certainty of being within 20 years of correct, but I felt okay about the result. This might be another source of systematic error?
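For what it's worth, the translation from "error bars of 75 years" to "percent certainty of being within 20 years" depends on what the error bars are taken to mean. Here is a minimal sketch, assuming a normally distributed error centred on the guess and treating the 75 years as the half-width of an interval with some chosen coverage. Both the normal assumption and the helper name `prob_within` are mine, not anything the survey specified.

```python
from statistics import NormalDist

def prob_within(tolerance, error_bar, coverage):
    """P(|error| <= tolerance), assuming error ~ Normal(0, sd), where
    +/- error_bar is read as an interval containing `coverage` of the mass."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # two-sided z-score
    sd = error_bar / z
    dist = NormalDist(0, sd)
    return dist.cdf(tolerance) - dist.cdf(-tolerance)

# Same 75-year error bars, read as 68%, 90% or 95% intervals
for coverage in (0.68, 0.90, 0.95):
    print(coverage, round(prob_within(20, 75, coverage), 2))
```

The wider the coverage you meant by the 75 years, the tighter the implied distribution and the higher the implied confidence of being within 20 years, so two people with the same error bars can legitimately report quite different percentages.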