Considering how much people talk about superforecasters, how come there aren't more public sources of superforecasts? There are prediction markets, and sites like ElectionBettingOdds that make it easier to read their odds as probabilities, but only for a limited set of questions. There's Metaculus, but it only shows a crowd median (with a histogram of predictions) and, in some cases, the result of an aggregation algorithm that I don't trust very much. There's PredictionBook, but it's not obvious how to extract a good single probability estimate from it. Both prediction markets and Metaculus are competitive, which disincentivizes public cooperation. What else is there if I want to know something like the probability of war with Iran?
I think the Metaculus crowd median is among the highest-quality predictions out there, especially when someone goes through all the questions where they're confident the median is off and makes comments pointing this out. I used to do this some months back, when Metaculus had more short-term questions and more questions where I differed from the community. The comments of this type that you made on Metaculus a month back covered most of the 'holes', in my opinion, and now there are only a few questions where I differ from the median prediction.
Another source of predictions is the IARPA Geoforecasting Challenge: if you're competing, you have access to hundreds of MTurk human predictions through an API. The quality of the predictions is not as great, and there are some questions where the MTurk crowd is way off. But they do have a question on whether Iran will execute or be targeted in a national military attack.
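A sketch of how one might turn those raw MTurk forecasts into a single probability; the endpoint and JSON field names below are placeholders I made up, since the actual GF Challenge API isn't documented in this thread, but the median step is the part that matters:

```python
import statistics

import requests  # third-party: pip install requests

# NOTE: the endpoint and JSON field names are hypothetical placeholders;
# the real GF Challenge API is not documented in this thread.
API_URL = "https://example.com/gfc/questions/{qid}/predictions"

def crowd_probability(qid: str, api_key: str) -> float:
    """Fetch individual forecasts for one binary question and reduce them
    to a single probability via the median (robust to way-off forecasters)."""
    resp = requests.get(API_URL.format(qid=qid),
                        headers={"Authorization": f"Bearer {api_key}"})
    resp.raise_for_status()
    probs = [f["probability"] for f in resp.json()["forecasts"]]
    return statistics.median(probs)
```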
I agree that it's quite possible to beat the best publicly available forecasts. I've been wanting to work on this with a small team (where I imagine the same set of people would both debate and make the predictions). If anyone's interested, I'm datscilly on Metaculus and can be reached at [my name] at gmail.
Maybe Good Judgement Open? I don't know how they actually get their probabilities, though.
I think one could greatly outperform the best publicly available forecasts through collaboration between 1) some people good at arguing and looking for info and 2) someone good at evaluating arguments and aggregating evidence. Maybe just a forum thread where a moderator keeps a percentage estimate updated in the top post.
I would trust the aggregation algorithm on Metaculus more than an average (mostly because its performance is evaluated against an average). So I think that’s usually pretty decent.
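Metaculus doesn't spell out its algorithm in this thread, but as a generic illustration of how an aggregation algorithm can differ from a plain average, here's one common recipe: average the forecasts in log-odds space, then extremize. The extremization constant below is an arbitrary choice for illustration, not Metaculus's:

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

def extremized_mean(probs: list[float], a: float = 1.5) -> float:
    """Average in log-odds space, then push away from 0.5 by a factor of a.
    With a = 1 this reduces to the geometric mean of odds."""
    mean_log_odds = sum(logit(p) for p in probs) / len(probs)
    return 1 / (1 + math.exp(-a * mean_log_odds))  # sigmoid

print(extremized_mean([0.6, 0.7, 0.8]))  # ≈ 0.79 (the plain average is 0.70)
```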
I would normally trust it more, but it's recently been doing way worse than the Metaculus crowd median (average log score 0.157 vs. 0.117, lower being better, over the sample of 20 resolved yes/no questions I've tracked), and based on the details of the estimates, that doesn't look to me like just bad luck. It does better on the full set of questions, but I think still not much better than the median; I can't find the analysis page at the moment.
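For concreteness, here's a minimal sketch of the kind of score being compared above, assuming the standard binary log loss where lower is better; the forecasts and outcomes are placeholders, not the actual 20-question sample:

```python
import math

def avg_log_loss(forecasts: list[float], outcomes: list[int]) -> float:
    """forecasts: probabilities assigned to YES; outcomes: 1 = YES, 0 = NO.
    A perfect forecaster scores 0; higher means worse."""
    return sum(-math.log(p if y else 1 - p)
               for p, y in zip(forecasts, outcomes)) / len(forecasts)

print(avg_log_loss([0.9, 0.8, 0.3], [1, 1, 0]))  # ≈ 0.228
```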
For example:
There's a question about whether the S&P 500 will end the year higher than it began. When the question closed, the index had already risen from 2500 to 2750 on the year, and historically the index has ended higher in most years. But the Metaculus estimate was about 50%.
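To see why ~50% looks too low here: with the index already up 10%, the question is really whether the rest of the year wipes out that gain. A back-of-the-envelope sketch under a driftless normal approximation; the three months remaining and 15% annualized volatility are assumptions picked for illustration:

```python
import math

def prob_ends_higher(gain_so_far: float, months_left: float,
                     annual_vol: float = 0.15) -> float:
    """P(index ends the year above its starting level), zero-drift normal approx."""
    sigma = annual_vol * math.sqrt(months_left / 12)  # vol of the remaining log-return
    needed = math.log(1 / (1 + gain_so_far))          # log-return that erases the gain
    z = needed / sigma
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z > z)

print(prob_ends_higher(0.10, 3))  # ≈ 0.90, well above 50%
```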
On another question, at the time of closing, 538's estimate was 99+% while the Metaculus estimate was 66%. I don't think Metaculus had significantly different information than 538.
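Plugging those numbers into the log-loss sketch above shows how costly that gap is if 538 was right: a 66% forecast on a YES resolution is penalized roughly forty times as heavily as a 99% one.

```python
import math

print(-math.log(0.66))  # ≈ 0.416, penalty for the 66% forecast if it resolves YES
print(-math.log(0.99))  # ≈ 0.010, penalty for the 99% forecast
```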