Welcome to the predictions fun!
I’m impressed with how little you put on 14 & 15; those were particularly good predictions IMO.
I think there might be an error on your calculation sheet: for instance, your score for 3 should be the same as your score for 5?
Regarding 14/15, I felt that we were probably under-reacting, but “general consensus” is tricky. We were in the home stretch of the Trump presidency, so I figured the baseline odds of “consensus” on anything were extremely low.
I’m kicking myself on #16 - I don’t know enough about epidemiology to make such a strong guess.
Yeah, I did the same thing on #38, where I was overconfident on an economy question that I don’t know nearly enough about.
On #16 itself I was lower than I should have been because I was using “virus” as a reference class rather than “respiratory virus”, which was an obvious mistake in hindsight.
Is the rule supposed to be symmetric around 50%? I used ln(p) - ln(.5) because Scott wrote:
“I scored these using a logarithmic scoring rule, adjusted so that guessing 50-50 always gave zero points.”
However, this doesn’t square with his second statement:
“Getting everything maximally right gives a score of about 14; guessing 50-50 for everything gives a score of 0, getting everything maximally wrong gives a score of negative infinity.”
Do you know what the correct scoring rule is?
It looks like you’re using the correct formula but maybe misreading what the “p” in it means, so that your scores on questions where the result was “false” are incorrect.
I think you maybe used ln(probability put on “true”) - ln(.5) and then multiplied the result by −1 if the actual answer was false?
The formulation Scott used was ln(probability put on the correct answer) - ln(.5).
So for q3, for example, the calculation shouldn’t be
(ln(0.1) − ln(0.5)) × (−1) = 1.61
but should be
ln(1 − 0.1) − ln(0.5) = 0.59
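In case it helps with checking a spreadsheet, here is a minimal Python sketch of the two calculations above. The function names `log_score` and `mistaken_score` are made up for illustration, and `mistaken_score` is only a guess at what the spreadsheet did, based on the description above.

```python
import math


def log_score(p_true: float, outcome: bool) -> float:
    """Log score relative to a 50-50 guess:
    ln(probability put on the correct answer) - ln(0.5)."""
    # Probability that was assigned to whatever actually happened.
    p_correct = p_true if outcome else 1.0 - p_true
    if p_correct <= 0.0:
        # Full confidence in the wrong answer.
        return float("-inf")
    return math.log(p_correct) - math.log(0.5)


def mistaken_score(p_true: float, outcome: bool) -> float:
    """Suspected mistake (an assumption): always score the probability
    put on "true", then flip the sign when the answer was false."""
    raw = math.log(p_true) - math.log(0.5)
    return raw if outcome else -raw


# q3 from the thread: 10% on "true", and the actual answer was false.
print(round(mistaken_score(0.1, False), 2))  # 1.61
print(round(log_score(0.1, False), 2))       # 0.59
```

With this formulation a 50-50 guess scores 0 whichever way the question resolves, the most you can gain on a single question is ln(2) ≈ 0.69, and the most you can lose is unbounded, so the rule is not symmetric around 50%.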
That looks right to me. If so, and if I’ve done the calculations right, the actual score should be (not +3.34 but) −1.89, just a little bit better than Bucky’s score according to Scott. (Except that #18, whether Scott went back to working in the office, seems to be missing; perhaps you didn’t bother predicting on that one because it seemed too Scott-specific? So a comparison against others who did predict that one will be misleading unless you remove it from their scores. Scott, Zvi and Bucky all lost quite a few points on #18.)
Yeah, I didn’t actually answer q18 either (possibly knite used my list as a basis?) for exactly that reason. Scott just put me in as the same as him for that question for the purposes of making an apples-to-apples comparison, which seemed fine. I have no idea what I would have put if I had answered!