(Aside: Although it is tricky to put human ability on a cardinal scale, the normal-distribution properties of things like working memory suggest cognitive ability (however cashed out) isn’t power-law distributed.)
Almost all scales in psychometrics are normalized, and the ones that are not normalized usually show very lopsided distributions. An interesting illustration here is the original Stanford-Binet IQ test scale, which just gave children a set of questions, divided the resulting score by the average for children of that age (and then multiplied it by 100), and which produced very wide distributions, with scores at the 90th percentile or so being a factor of 15 apart.
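As a minimal sketch of that scoring rule (the numbers here are made up for illustration):

```python
# Hypothetical illustration of the original Stanford-Binet 'ratio IQ':
# the child's raw score divided by the average raw score for their age,
# multiplied by 100.
def ratio_iq(raw_score: float, age_average: float) -> float:
    return 100 * raw_score / age_average

# A child scoring 30 where the average for their age group is 24:
print(ratio_iq(30, 24))  # 125.0
```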
I don’t know which working memory scale Greg is referring to here, but I would be quite surprised if that scale isn’t manually normalized, and would expect various forms of working memory measures to vary drastically between different people. As an example, the digit span distribution in this paper looks clearly log-normal (or some similar distribution), but definitely not normal:
https://www.researchgate.net/figure/Figure-Distribution-of-digit-numbers-in-the-backward-digit-span-test_7664779
I’m aware of normalisation, hence I chose things which have some sort of ‘natural cardinal scale’ (i.e. ‘how many Raven’s matrices do you get right’ doesn’t really work, but ‘how many things can you keep in mind at once’ is better, albeit imperfect).
Not all skew entails a log-normal (or some similar, presumably heavy-tailed) distribution. This applies to the digit span graph you cite here. The mean of the data is around 5, and the SD is around 2. Having ~11% at +1SD (7) and about 3% at +2SD (9) is a lot closer to normal-distribution land (or, given this is count data, a pretty well-behaved Poisson/slightly overdispersed binomial) than to a hypothetical log-normal. Given log-normality, one should expect a dramatically higher maximum score when you increase the sample size from the 78 in the cited study to 2,400 or so. Yet in the WAIS-III standardization sample, which is about this size, no individual scored higher than 9 on forward digit span (and no one higher than 8 in reverse). (This is, I assume, the foundation for the famous ‘7 plus or minus 2’ claim.)
http://www.sciencedirect.com/science/article/pii/S0887617701001767#TBL2
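To make this concrete, here is a minimal sketch (assuming the quoted mean of ~5 and SD of ~2, with a log-normal moment-matched to the same mean and SD) comparing the tail probabilities and the typical sample maximum each distribution predicts:

```python
import numpy as np
from scipy import stats

mean, sd = 5.0, 2.0  # the figures quoted above, not refit from the paper

normal = stats.norm(loc=mean, scale=sd)

# Log-normal with the same mean and SD (moment matching).
cv2 = (sd / mean) ** 2
lognormal = stats.lognorm(s=np.sqrt(np.log(1 + cv2)),
                          scale=mean / np.sqrt(1 + cv2))

for label, dist in [("normal", normal), ("log-normal", lognormal)]:
    # Median of the maximum of n draws: the 0.5**(1/n) quantile.
    max78 = dist.ppf(0.5 ** (1 / 78))
    max2400 = dist.ppf(0.5 ** (1 / 2400))
    print(f"{label:>10}: P(>=7)={dist.sf(7):.1%}  P(>=9)={dist.sf(9):.1%}  "
          f"typical max: n=78 -> {max78:.1f}, n=2400 -> {max2400:.1f}")

# The log-normal predicts a typical maximum around 17-18 at n=2,400,
# versus around 12 for the normal; no one in the WAIS-III sample beat 9.
```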
A lot turns on ‘vary drastically’, but I think on most commonsense uses of the phrase this would not be it. I’d take reaction time data to be similar: although there is a ‘long tail’, it is a long tail of worse performance, and the tail isn’t that long. So I don’t buy claims I occasionally see made along the lines of ‘Einstein was just miles smarter than a merely average physicist’.
Huh, I notice that I am confused about no one in the sample having a digit span larger than 9. Do we know whether they simply stopped measuring at 9?
This random blogpost suggests that they stop at 9: https://pumpkinperson.com/2015/11/19/the-iq-of-daniel-seligman-part-5-digit-span-subtest/
I was unaware of the range restriction, which could well compress the SD. That said, if you take the ‘9’ scorers as ‘9 or more’, then you get something like this (using the 20-25 age group):
The mean value is around 7 (6.8), and 7% get 9 or more, suggesting 9 sits at around +1.5SD assuming normality. So when you get a sample size in the thousands, you should start seeing scores of 11 or so (+3SD); I wouldn’t be startled to find Ben has this level of ability. But scores of (say) 15 or higher (+6SD) should be seen only extraordinarily rarely.
If you instead use log-normal assumptions, you should expect something like: if +1.5SD is +2 points, +3SD is around +6 (i.e. ~13), and +4.5SD would give scores of 21 or so.
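For concreteness, a sketch of both extrapolations, assuming a mean of 6.8 and a score of 9 sitting at +1.5SD (so SD ≈ 1.47 under normality). The log-normal lines are one way of cashing out ‘increments multiply rather than add’, and their exact figures depend on how the distribution is fit, so they differ slightly from the ~13 and ~21 quoted above:

```python
from scipy import stats

mean = 6.8
sd = (9 - mean) / 1.5  # ~1.47, from treating a score of 9 as +1.5SD

# Normal: increments add.
for z in (3.0, 4.5, 6.0):
    rarity = 1 / stats.norm.sf(z)
    print(f"+{z}SD (normal): score ~{mean + z * sd:.0f}, "
          f"about 1 in {rarity:,.0f}")
# -> ~11 (1 in ~700), ~13 (1 in ~300,000), ~16 (1 in ~1 billion)

# Log-normal: increments multiply; each further +1.5SD scales the score
# by 9/6.8 ~= 1.32.
ratio = 9 / mean
for z, k in ((3.0, 2), (4.5, 3), (6.0, 4)):
    print(f"+{z}SD (log-normal): score ~{mean * ratio ** k:.0f}")
# -> ~12, ~16, ~21
```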
An unfortunate challenge in picking at the tails here is that one can train digit span: memory athletes drill this, and I understand the record lies in the three figures.
Perhaps a natural test would be getting very smart but training-naive people (IMOers?) to try this. If they’re consistently scoring 15+, this is hard to reconcile with normal-ish assumptions (digit span wouldn’t correlate perfectly with mathematical ability, so lots of 6-sigma-plus results would look weird), and vice versa.
Quick sanity check:
4.5SD = roughly 1 in 300,000 (according to wikipedia)
UK population = roughly 50 million
So there’d be roughly 50,000,000 / 300,000 ≈ 170 people in the UK who should be able to score ~21 or more (quick check below). Which seems quite plausible to me.
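A quick version of that arithmetic (the exact normal tail probability gives something closer to 170 than 150, but the same ballpark):

```python
from scipy import stats

p_tail = stats.norm.sf(4.5)           # ~3.4e-6, i.e. roughly 1 in 300,000
uk_population = 50_000_000            # the rough figure used above
print(round(uk_population * p_tail))  # ~170 people
```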
Also I know a few IMO people, I bet we could test this.
I would be happy to take a bet that took a random sample of people we knew (let’s say 10) and saw whether their responses fit better with a log-normal or a normal distribution, though I do guess this would not be very discriminating, since we are looking for divergence in the tails.
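As a minimal sketch of how such a bet could be scored, one could fit both distributions by maximum likelihood and compare log-likelihoods (the scores below are made up for illustration):

```python
import numpy as np
from scipy import stats

scores = np.array([5, 6, 6, 7, 7, 7, 8, 8, 9, 11])  # hypothetical sample

mu, sd = stats.norm.fit(scores)
ll_normal = stats.norm.logpdf(scores, mu, sd).sum()

# Pin the log-normal's location at 0, since digit spans are positive counts.
s, loc, scale = stats.lognorm.fit(scores, floc=0)
ll_lognormal = stats.lognorm.logpdf(scores, s, loc, scale).sum()

print(f"log-likelihood: normal {ll_normal:.2f}, log-normal {ll_lognormal:.2f}")
# With n=10 the two fits will usually be near-indistinguishable, as noted:
# the distributions only diverge meaningfully in the tails.
```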
I would take a bet that, in a hypothetical dataset that extended further, the maximum among 2,400 participants would be at least 12.
Oli just gave me the test as described on Wikipedia, and I got all the way up to 11. According to Greg’s world model, I’m at least in the top 0.05% or so (better than 2,400 random students), but given that a normal distribution predicts zero scores of 10 or more in a sample of 2,400, I must be way higher than that. (If anyone can do the maths, it would be appreciated; I’d guess I’m more like 1 in a million, though. According to Greg’s world-model.)
Added: Extra info, I started by visualising the first 6 digits (in 2 groups of 3) and remembering the rest in my auditory memory.
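For the requested maths, here is a sketch under two readings of the normal model discussed above (both assume a mean of 6.8; the parameterisations are my guesses, not Greg’s):

```python
from scipy import stats

mean = 6.8

# Reading 1: a score of 9 sits at +1.5SD, so SD ~= 1.47.
sd1 = (9 - mean) / 1.5
print(1 / stats.norm.sf(11, mean, sd1))   # a score of 11 ~ 1 in ~500

# Reading 2: no one in 2,400 scored 10+, so P(score >= 10) <= 1/2400,
# which forces SD <= (10 - mean) / 3.34 ~= 0.96.
sd2 = (10 - mean) / stats.norm.isf(1 / 2400)
print(1 / stats.norm.sf(11, mean, sd2))   # a score of 11 ~ 1 in ~170,000
```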