That’s kind of Eliezer’s point when he talks about how astounding it is that human beings are unbiased estimators of beans in a jar. I’d agree that it’s astounding, but there are plenty of other statistical phenomena that astound me equally, so I’ve learned to not treat my level of astonishment as a precision tool for judging incredibility.
To some extent, I suspect the mechanism of estimation plays a significant role. I doubt very much that human beings have built-in heuristics for appraising large numbers of objects. Arithmetic is a fairly novel concept, evolutionarily speaking, and some cultures don’t even have the natural numbers.
So when we try to guess the number of beans in a jar, there’s presumably no single go-to mechanism we’re using to come up with that value. It will be some sort of aggregate of sources, such as our past experience of beans in jars, visualisations of what 200 or 400 or 600 beans all in one place might look like, or rough guesses of volume and packing density. It isn’t even necessarily a transparent process. If you try to make a rough estimate of something, aren’t you using some sort of basis for that? It’s not like the number just pops into your head. You wrestle with it for a little while.
Individual components of that estimate may be biased in a given direction, but over enough sources, and over enough people with many different estimation criteria, I wouldn’t expect a demonstrable bias to show up over repeated experiments without deliberate intervention on the part of the experimenter, such as using a container of an unusual shape whose volume is known to be overestimated. (A rough simulation of this intuition is sketched at the end of this comment.)
Edit: I should also add that some bias will be idiosyncratic to specific questions. For example, I think it was Yvain’s most recent LW membership poll that asked for the date Newton published his Philosophiæ Naturalis Principia Mathematica. If there were a widely believed false date for that event, it would be an obvious source of noise that wouldn’t be cancelled out by corresponding noise on the other side of the true value.
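Here is a minimal sketch of that intuition in Python (the numbers and the two “methods” are made up purely for illustration, and it assumes numpy is available): two estimation methods with opposite biases, mixed in different proportions by different people, still give a crowd mean close to the true count.

import numpy as np

rng = np.random.default_rng(0)
true_count = 600      # hypothetical number of beans in the jar
n_people = 10_000

# Two made-up estimation methods with opposite biases: volume-and-packing
# guesses run ~15% high, visual-memory guesses run ~15% low. Each person
# mixes the two methods with their own random weighting.
volume_based = true_count * rng.normal(1.15, 0.30, size=n_people)
memory_based = true_count * rng.normal(0.85, 0.35, size=n_people)
w = rng.uniform(size=n_people)                 # personal weighting of the two methods
guesses = w * volume_based + (1 - w) * memory_based

print("mean guess:", round(float(guesses.mean())))   # lands near 600: opposing biases wash out

Of course, the cancellation here is built into the made-up parameters; the claim above is only that, absent a shared distortion, there is no particular reason to expect all the individual biases to line up on one side.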
According to a study cited in the Model Thinking class from Coursera.org, this is correct. Crowds that are collectively hedgehogs do not have wisdom; crowds that are collectively foxes do. The diversity of models is key.
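If I’m remembering the class right, the relevant result is Page’s diversity prediction theorem: collective error equals average individual error minus prediction diversity, so a diverse crowd beats its typical member by exactly the amount it disagrees internally. A quick numerical check (the predictions below are made-up values):

import numpy as np

theta = 600.0                                        # true value (illustrative)
s = np.array([450.0, 520.0, 610.0, 700.0, 830.0])    # individual predictions (made up)
c = s.mean()                                         # the crowd's prediction

collective_error = (c - theta) ** 2
avg_individual_error = np.mean((s - theta) ** 2)
diversity = np.mean((s - c) ** 2)

# Diversity prediction theorem: this identity holds exactly for any numbers.
assert np.isclose(collective_error, avg_individual_error - diversity)
print(collective_error, avg_individual_error, diversity)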
Individual components of that estimate may be biased in a given direction, but over enough sources, and over enough people with many different estimation criteria, I wouldn’t expect a demonstrable bias to show up over repeated experiments without deliberate intervention on the part of the experimenter
This can be seen simply as a version of the central limit theorem: any sum or average of samples from ANY distribution (with finite mean and standard deviation) will be approximately normally distributed (Gaussian), with the approximation improving as the sample gets larger. Neato!
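A minimal sketch of that in Python (assuming numpy; the exponential distribution is just a stand-in for an arbitrary skewed distribution with finite mean and SD): averages of more and more draws look more and more Gaussian, which shows up as the skewness shrinking toward zero.

import numpy as np

rng = np.random.default_rng(1)

# Averages of n draws from a heavily skewed distribution (exponential, mean 1)
# already look roughly Gaussian for modest n: skewness falls off as n grows.
for n in (1, 5, 50, 500):
    means = rng.exponential(scale=1.0, size=(20_000, n)).mean(axis=1)
    skew = np.mean(((means - means.mean()) / means.std()) ** 3)
    print(f"n={n:4d}  mean={means.mean():.3f}  sd={means.std():.3f}  skewness={skew:.3f}")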
I’d say it’s related to the central limit theorem, but I’d be cautious about equating the two. We would probably expect a roughly Gaussian distribution from a variable which is the sum (or, after a log transform, the product) of a lot of component parts (i.e. lots of different estimator methods), but we wouldn’t necessarily expect the mean to coincide with the true value unless some of those estimator methods were reliable and they didn’t collectively skew the distribution in one direction. (The sketch at the end of this comment makes that caveat concrete.)
(And nit-picking: it’s “a well-defined population mean and population standard deviation”, which is what’s required for defining the limiting distribution. If you can’t trust your sample mean and sample SD to approximate the population mean and SD, the normal approximation is no longer reliable, and you’d have to use something else, like a t-distribution.)
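To make the main caveat concrete, here’s a tiny sketch (made-up numbers, assuming numpy) where every individual guess runs 20% high: the crowd average is beautifully Gaussian and tightly concentrated, and still wrong.

import numpy as np

rng = np.random.default_rng(2)
true_count = 600
# Every guess is drawn around 1.2 * true_count: individually noisy, collectively biased.
guesses = rng.normal(loc=1.2 * true_count, scale=200, size=100_000)

print(guesses.mean())   # ~720, not 600: averaging kills the noise but not the shared bias

And for the nit-pick, the practical difference is just which interval you compute when the SD is itself estimated from a small sample (arbitrary numbers, assuming scipy is available):

import numpy as np
from scipy import stats

sample = np.array([540.0, 610.0, 585.0, 630.0, 575.0])   # a small, made-up sample of guesses
m, se = sample.mean(), stats.sem(sample)                  # sample mean and its standard error

# With only n=5 points, the t interval is noticeably wider than the normal one,
# because it accounts for the extra uncertainty in the estimated SD.
print(stats.t.interval(0.95, len(sample) - 1, loc=m, scale=se))
print(stats.norm.interval(0.95, loc=m, scale=se))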
Is there a short explanation of why we should expect an absence of systematic bias?