It doesn’t have to be a Gaussian distribution. We would expect it to look like one under reasonable assumptions, but systematic bias would skew it. A single, particularly large source of error (say there was a Battle of Dosworth Field that happened 400 years later) could easily result in a bimodal distribution.
For Wisdom of Crowds to work (as it’s expected to work), people can’t simply be guessing along a Gaussian distribution. They’re applying knowledge they have; some of that knowledge is useful information, and some of it is noise. The useful information pulls the mean towards the true value, while the noise pulls it away. The difference is that the useful information converges on a single value (because it’s a convergent problem with a single correct answer), while the noise pulls arbitrarily in all directions.
Provided there isn’t some reason for the noise itself to converge on a single value, the noise should cancel itself out. (I think this is where my previous comments have not necessarily been clear: I’m talking about the noise converging, not the overall mean.)
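To make the cancellation point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration only: the true value of 1500, the noise spread of 100, the 30% share of guessers who anchor on a decoy 400 years later, and the helper name `crowd_mean`. It only shows that symmetric noise averages out while noise that converges on its own value does not.

```python
import random
import statistics

def crowd_mean(true_value, n, noise_sd, decoy=None, decoy_share=0.0, seed=0):
    """Mean of n simulated guesses (hypothetical model, not real data).

    Each guess is symmetric Gaussian noise around the true value, except
    for a share of guessers who centre their guess on a decoy instead.
    """
    rng = random.Random(seed)
    guesses = []
    for _ in range(n):
        if decoy is not None and rng.random() < decoy_share:
            centre = decoy          # noise that converges on its own value
        else:
            centre = true_value     # noise scattered around the truth
        guesses.append(rng.gauss(centre, noise_sd))
    return statistics.fmean(guesses)

TRUE_VALUE = 1500  # made-up "true" year for the sketch

# Noise pulls in all directions: the mean lands close to the true value.
unbiased = crowd_mean(TRUE_VALUE, 10_000, noise_sd=100)

# 30% of the crowd anchors on a decoy 400 years later: the noise no longer
# cancels, and the mean is dragged towards the decoy.
pulled = crowd_mean(TRUE_VALUE, 10_000, noise_sd=100,
                    decoy=TRUE_VALUE + 400, decoy_share=0.3)

print(f"unbiased mean: {unbiased:.1f}")
print(f"decoy-pulled mean: {pulled:.1f}")
```

In the second run the individual guesses form two humps (one around the true value, one around the decoy), which is the bimodal case: the crowd mean ends up between the two rather than on the right answer.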
It should be obvious that if you give people a right answer and a wrong answer, the noise will be weighted in the direction of the wrong answer (because there’s no corresponding error on the other side of the true value). Even if you have two wrong answers on either side of the true value, and ask people to pick the one closest to it, you will still have a skew problem: unless the two values are equidistant from the true value (which defeats the point of the question), your noise is not going to be equally distributed around the true value.
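The same kind of sketch can illustrate the fixed-choice case. The model here is my own assumption, not anything established: each person holds a noisy private belief centred on the true value and picks whichever offered option is nearest to that belief. The numbers and the helper name `mean_of_picks` are made up for the example.

```python
import random
import statistics

def mean_of_picks(true_value, options, noise_sd, n=10_000, seed=0):
    """Average of the options chosen by n simulated people (assumed model).

    Each person has a belief drawn symmetrically around the true value,
    then picks the offered option closest to that belief.
    """
    rng = random.Random(seed)
    picks = []
    for _ in range(n):
        belief = rng.gauss(true_value, noise_sd)
        picks.append(min(options, key=lambda option: abs(option - belief)))
    return statistics.fmean(picks)

TRUE_VALUE = 1500  # made-up true value

# Right answer vs. one wrong answer: every error lands on the same side.
one_sided = mean_of_picks(TRUE_VALUE, [1500, 1600], noise_sd=100)

# Two wrong answers at unequal distances (20 below, 100 above).
lopsided = mean_of_picks(TRUE_VALUE, [1480, 1600], noise_sd=100)

# Two wrong answers at equal distances: the only case that balances.
symmetric = mean_of_picks(TRUE_VALUE, [1400, 1600], noise_sd=100)

print(f"right vs wrong:     {one_sided:.1f}")
print(f"unequal distances:  {lopsided:.1f}")
print(f"equal distances:    {symmetric:.1f}")
```

Under these assumptions the underlying beliefs are perfectly symmetric around the true value, yet the average of the picks only comes back to it in the equidistant case. The skew comes from the options on offer, not from the crowd.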