Elo, thanks a lot for doing this.
(For the record, Elo tried really hard to get me involved and I procrastinated helping and forgot about it. I 100% endorse this.)
My only suggestion is to create a margin of error on the calibration questions, e.g. “How big is the soccer ball, to within 10 cm?”. Otherwise people are guessing whether they got the exact centimeter right, which is pretty hard.
Since you are such a huge part of the diaspora community, I would be delighted if you could share the survey with both your readers and your friends.
We will get that suggestion sorted ASAP.
I actually can’t do that. The way our survey engine works, changing the question answers mid-survey would require taking it down for maintenance and hand-joining the current respondents to the new respondents. In general I planned to handle the “within 10 cm” thing during analysis. Try to Fermi-estimate the value and give your closest answer, then the probability you got it right. We can look at how close your confidence was to a sane range of values for the answer.
I.e., if you got it within ten and said you had a ten percent chance of getting it right, you’re well calibrated.
Note: I am not entirely sure this is sane, and would like feedback on better ways to do it.
EDIT: I should probably be very precise here. I cannot change the question answers in the software, presumably because it would involve changing the underlying table schema for the database. I can change the questions and question descriptions, so if there’s a superior process for answering these I could describe it there.
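For concreteness, here is a rough sketch of what "handle it during analysis" could look like: pick a tolerance, count each estimate within that tolerance as a hit, and compare the hit rate against the stated confidence. The response format, the function name `calibration_table`, and all the numbers are made up for illustration; this is not the survey's actual analysis code.

```python
from collections import defaultdict

def calibration_table(responses, true_value, tolerance):
    """Group responses by stated confidence and report the observed hit rate.

    responses: list of (estimate, stated_probability) pairs (hypothetical format).
    A response counts as a hit if its estimate is within `tolerance` of true_value.
    """
    buckets = defaultdict(list)
    for estimate, stated_p in responses:
        hit = abs(estimate - true_value) <= tolerance
        # Bucket by stated confidence rounded to the nearest 10%.
        buckets[round(stated_p, 1)].append(hit)
    # Fraction of hits per confidence bucket, in ascending bucket order.
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}

# Made-up example: true value 22, tolerance 10.
responses = [(20, 0.1), (35, 0.1), (22, 0.9), (24, 0.9)]
table = calibration_table(responses, true_value=22, tolerance=10)
print(table)
```

A bucket is well calibrated when its hit rate is close to its stated confidence. Note that the result depends on the `tolerance` chosen, which the respondent never saw.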
But unless I’m misunderstanding you, the size of the unspoken “sane range” is the entire determinant of how you should calibrate yourself.
Suppose you ask me when Genghis Khan was born, and all I know is “sometime between 1100 and 1200, with certainty”. Suppose I choose 1150. If you require the exact year, then I’m only right if it was exactly 1150, and since it could be any of 100 years my probability is 1%. If you require within five years, then I’m right if it was any time between 1145 and 1155, so my probability is about 10%. If you require within fifty years, then my probability is effectively 100%. All of those are potential “sane ranges”, but depending on which one you pick, the correctly calibrated estimate could be anywhere from 1% to 100%.
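The arithmetic in the Genghis Khan example can be checked with a quick sketch. It assumes a uniform belief over 100 candidate years (1100 through 1199, so the interval is treated as half-open); the function name `prob_within` is just illustrative.

```python
def prob_within(guess, low, high, tolerance):
    """Probability that the true year is within `tolerance` of `guess`,
    assuming a uniform belief over the integer years low..high-1."""
    years = range(low, high)  # half-open: 100 years for 1100..1200
    hits = sum(1 for year in years if abs(year - guess) <= tolerance)
    return hits / len(years)

# Same guess, same knowledge, three different unspoken "sane ranges".
for tolerance in (0, 5, 50):
    print(tolerance, prob_within(1150, 1100, 1200, tolerance))
```

Under these assumptions the three tolerances give 1%, 11% (the “about 10%” case, since 1145 to 1155 inclusive is eleven years), and 100%.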
Unless I am very confused, you might want to change the questions and hand-throw-out all the answers you received before now, since I don’t think they’re meaningful (except if interpreted as probability of being exactly right).
(Actually, it might be interesting to see how many people figure this out, in a train wreck sort of way.)
PS: I admit this is totally 100% my fault for not getting around to looking at it the five times you asked me to before this.
Yeah, you’re right.
Currently trying to figure out how to do that in the least intrusive way.
EDIT: Good news: it turns out that I can edit the calibration question ‘answers’ after all. The ones where a range would make sense have been edited to include one. Questions such as “which is heavier” have not been, because the ignorance prior should be fairly obvious.
Fri Mar 25 19:50:41 PDT 2016 | Answers on or before this date where the ranges have been added will be controlled for at analysis time.
If you throw out the data, I request you keep the thrown-out data somewhere else so I can see how people responded to the issue.
I don’t throw out data. Ever. I only control for it. (Well, barring exceptional circumstances.)
Even if he threw out the data I have recurring storage snapshots happening behind the scenes (on the backing store for the OSes involved.)