I don’t see much disagreement between the two sources. The Vox article doesn’t claim that there is much reason for selecting the top 2% rather than the top 1% or the top 4% or whatever. And the SSC article doesn’t deny that the people who scored in the top 2% (and are thereby labeled “Superforecasters”) systematically do better than most at forecasting.
I’m puzzled by the use of the term “power law distribution”. I think that the GJP measured forecasting performance using Brier scores, and Brier scores are always between 0 and 1, which is the wrong shape for a fat-tailed distribution. And the next sentence (which begins “that is”) isn’t describing anything specific to power law distributions. So probably the Vox article is just misusing the term.
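To make the boundedness point concrete, here is a minimal sketch of the common two-outcome Brier score (the GJP's exact scoring rule may have differed, e.g. a multi-category variant with a larger range, but any Brier-style score is bounded):

```python
# Minimal sketch: the two-outcome Brier score is just the squared error
# between a probability forecast and a 0/1 outcome, so it lives in [0, 1].
def brier_score(forecast_prob: float, outcome: int) -> float:
    return (forecast_prob - outcome) ** 2

assert brier_score(0.0, 0) == 0.0   # confidently right: best possible score
assert brier_score(1.0, 0) == 1.0   # confidently wrong: worst possible score
assert brier_score(0.5, 1) == 0.25  # maximally uncertain forecast
```

Since every score is trapped in a bounded interval, the distribution of scores across forecasters can't literally have a power-law (fat) tail.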
Hmm, thanks for pointing that out about Brier scores. The Vox article cites https://www.vox.com/2015/8/20/9179657/tetlock-forecasting for its “power law” claim, but that piece says nothing about power laws. It does have a graph which depicts a wide gap between “superforecasters” and “top-team individuals” in years 2 and 3 of the project, and not in year 1. But my understanding is that this is because the superforecasters were put together on elite teams after the first year, so I think the graph is a bit misleading.
I do think there's disagreement between the sources. When I read sentences like this from the Vox article:
Tetlock and his collaborators have run studies involving tens of thousands of participants and have discovered that prediction follows a power law distribution. That is, most people are pretty bad at it, but a few (Tetlock, in a Gladwellian twist, calls them “superforecasters”) appear to be systematically better than most at predicting world events … Tetlock even found that superforecasters — smart, well-informed, but basically normal people with no special information — outperformed CIA analysts by about 30 percent in forecasting world events.
I definitely imagine looking at a graph of everyone’s performance on the predictions and noticing a cluster who are discontinuously much better than everyone else. I would be surprised if the authors of the piece didn’t imagine this as well. The article they link to does exactly what Scott warns against, saying “Tetlock’s team found out that some people were ‘superforecasters’.”
Some evidence against this is that they described it as a “power law” distribution, which is continuous and doesn’t have these kinds of clusters. (It just goes way way up as you move to the right.)
If you had a power law distribution, it would still be accurate to say that “a few are better than most”, even though there isn’t a discontinuous break anywhere.
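To illustrate, here is a small simulation (illustrative only, not GJP data) drawing from a Pareto distribution, the textbook power law. The top 2% end up far above the median, yet the sorted values rise smoothly, with no discontinuous cluster at the top:

```python
import random

# Illustrative simulation (not GJP data): draw from a Pareto distribution,
# a textbook power law, and inspect the top of the sorted sample.
random.seed(0)
samples = sorted(random.paretovariate(1.5) for _ in range(10_000))

median = samples[len(samples) // 2]
cutoff = int(len(samples) * 0.98)
top_2_percent = samples[cutoff:]

# "A few are better than most": even the weakest member of the top 2%
# is several times the median performer.
print(min(top_2_percent) / median)

# ...yet there is no discontinuous break: the ratio between adjacent
# sorted values right at the 98th-percentile cutoff stays close to 1.
print(samples[cutoff] / samples[cutoff - 1])
```

So a power-law world supports the “a few are much better” headline without any separate “superforecaster” cluster existing in the data.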
EDIT: It seems to me that most things like this follow approximately continuous distributions. So whenever you hear someone talking about something like this, you should assume it’s continuous unless it’s super clear that it’s not (and that should be a surprising fact in need of explanation!). But note that people will often talk about it in misleading ways, because for the sake of discussion it’s often simpler to speak as if there are discrete groups. So the fact that people talk that way doesn’t mean they actually believe the groups are discrete. I think that’s what happened here.
(Citation for the point above about superforecasters being put on elite teams: the paper https://stanford.edu/~knutson/nfc/mellers15.pdf)