Sorry, I thought it was better explained in the original post I linked to, but in fact that one just directed people to the poll, which was self-explanatory.
Each user was randomly given one of two sets of questions (randomized by the parity of the minute at which they took the test, hence that rather odd question). Each set contained two questions, one of each ‘type’. In one type, the question was open-ended and users typed in their answer. In the other, the user was given two options and asked to pick the one they thought was closer to the correct answer.
So the first set (left in the chart) had the two-choice telephone question and the open-answer population-of-Africa question, while the other had the two-choice Africa question and the open-answer telephone question.
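For concreteness, here is a minimal sketch of the assignment rule described above, assuming it were implemented in code (the names and structure here are illustrative only, not taken from the actual poll): which set a respondent sees depends entirely on the parity of the minute at which they start the test.

```python
from datetime import datetime

# Hypothetical sketch of the minute-parity assignment rule.
# Set 0: two-choice telephone + open-answer population of Africa.
# Set 1: two-choice population of Africa + open-answer telephone.
QUESTION_SETS = {
    0: {"two_choice": "telephone", "open_ended": "population of Africa"},
    1: {"two_choice": "population of Africa", "open_ended": "telephone"},
}

def assign_question_set(start_time: datetime) -> dict:
    """Return the question set for a respondent who began at start_time."""
    return QUESTION_SETS[start_time.minute % 2]

if __name__ == "__main__":
    # Example: a respondent starting right now is assigned one of the two sets.
    print(assign_question_set(datetime.now()))
```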
Anyway, I didn’t expect the experiment itself to generate much interest, since I believe it to be systematically flawed; the more interesting part was the result that neither of the two open-ended questions produced very good estimates of the actual values. And, just as Eliezer had suspected some years ago, I neglected to publish these results because they were not as interesting as a “Wow, look at how wise the crowd is” result.
Could you summarize any differences in performance between the two question formats?
Inconclusive, and I believe the experiment to be flawed, since the average probably depends more on the two options presented than on what people actually believe. It might be more interesting as a way to test for anchoring, or for un-anchoring by showing two different options. But in short: neither format did well, and the two-choice format did better only when responders consistently missed the actual value (the telephone question), so that both of the offered choices were closer to the truth than what most people were giving.