I would trust the aggregation algorithm on Metaculus more than an average (mostly because its performance is evaluated against an average). So I think that’s usually pretty decent.
I would normally trust it more, but it’s recently been doing way worse than the Metaculus crowd median (average log score 0.157 vs 0.117 over the sample of 20 yes/no questions that have resolved for me), and based on the details of the estimates that doesn’t look to me like it’s just bad luck. It does better on the whole set of questions, but I think still not much better than the median; I can’t find the analysis page at the moment.
based on the details of the estimates that doesn’t look to me like it’s just bad luck
For example:
There’s a question about whether the S&P 500 will end the year higher than it began. When the question closed, the index had increased from 2500 to 2750. The index has increased most years historically. But the Metaculus estimate was about 50%.
On this question, at the time of closing, 538’s estimate was 99+% and the Metaculus estimate was 66%. I don’t think Metaculus had significantly different information than 538.
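To make the size of that gap concrete, here is a rough sketch of how a log score penalizes a 66% forecast compared with a 99% one. It assumes the plain natural-log rule (the log of the probability given to the outcome that occurred) and assumes, purely for illustration, that the question resolves yes. Metaculus's own scoring may be scaled or offset differently, so these numbers are not on the same scale as the 0.157 and 0.117 quoted above. The function names are made up for this example.

```python
import math

def log_score(p_yes: float, resolved_yes: bool) -> float:
    """Natural log of the probability assigned to the outcome that occurred.

    Assumption: plain log rule, higher (closer to 0) is better.
    Raises ValueError if the realized outcome was given probability 0.
    """
    p = p_yes if resolved_yes else 1.0 - p_yes
    return math.log(p)

def mean_log_score(forecasts):
    """Average log score over a list of (p_yes, resolved_yes) pairs."""
    scores = [log_score(p, r) for p, r in forecasts]
    return sum(scores) / len(scores)

# Hypothetical: the two closing estimates from the example,
# under the assumption that the question resolves yes.
crowd = log_score(0.66, True)
confident = log_score(0.99, True)
print(f"66% forecast: {crowd:.3f}")
print(f"99% forecast: {confident:.3f}")
```

Under these assumptions the 66% forecast gives up about 0.4 nats on a single question, which is large next to a difference of a few hundredths in an average over 20 questions.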