I remember reading a Gwern post that surveys a lot of studies on human ability, and they show very similar if not stronger results for my theory that human abilities have a very narrow range.
You are probably thinking of my mentions of Wechsler 1935 that if you compare the extremes (defined as best/worst out of 1000, ie. ±3 SD) of human capabilities (defined as broadly as possible, including eg running) where the capability has a cardinal scale, the absolute range is surprisingly often around 2-3x. There’s no obvious reason that it should be 2-3x rather than 10x or 100x or lots of other numbers*, so it certainly seems like the human range is quite narrow and we are, from a big picture view going from viruses to hypothetical galaxy-spanning superintelligences, stamped out from the same mold. (There is probably some sort of normality + evolution + mutation-load justification for this but I continue to wait for someone to propose any quantitative argument which can explain why it’s 2-3x.)
You could also look at parts of cognitive tests which do allow absolute, not merely relative, measures, like vocabulary or digit span. If you look at, say, backwards digit span and note that most people have a backwards digit span of only ~4.5 and the range is pretty narrow (±<1 digit SD?), obviously there’s “plenty of room at the top” and mnemonists can train to achieve digit spans of hundreds and computers go to digit spans of trillions (at least in the sense of storing on hard drives as an upper bound). Similarly, vocabularies or reaction time: English has millions of words, of which most people will know maybe 25k or closer to 1% than 100% while a neural net like GPT-3 probably knows several times that and has no real barrier to being trained to the point where it just memorizes the OED & other dictionaries; or reaction time tests like reacting to a bright light will take 20-100ms across all humans no matter how greased-lightning their reflexes while if (for some reason) you designed an electronic circuit optimized for that task it’d be more like 0.000000001ms (terahertz circuits on the order of picoseconds, and there’s also more exotic stuff like photonics).
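As a toy illustration of the digit-span numbers above: a normal trait with a narrow SD mechanically produces a small best-to-worst ratio at the 1-in-1000 extremes. This is only a sketch; the SD of 0.75 digits is an assumption standing in for the "±<1 digit" guess.

```python
from statistics import NormalDist

# Hypothetical: backwards digit span as a normal trait with mean ~4.5 digits
# and an *assumed* SD of 0.75 digits (the "+/-<1 digit" guess above).
span = NormalDist(mu=4.5, sigma=0.75)

# "Best/worst out of 1000" corresponds to roughly the 0.1% and 99.9%
# quantiles, i.e. about +/-3.09 SD.
lo = span.inv_cdf(0.001)
hi = span.inv_cdf(0.999)
ratio = hi / lo
print(f"worst-in-1000: {lo:.1f} digits, best-in-1000: {hi:.1f} digits, ratio: {ratio:.1f}x")
```

Under these assumed parameters the absolute range comes out to roughly 3x, in line with the 2-3x pattern; the point is only that a cardinal scale plus a narrow SD gives a small ratio, not that these are the true digit-span parameters.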
* for example, in what you might call ‘compound’ capabilities like ‘number of papers published’, the range will probably be much larger than ‘2-3x’ (most people publish 0 papers, and the most prolific author out of 1000 people probably publishes 100+), so it’s not like there’s any a priori physical limit on most of these. But these could just break down into atomic capabilities: if paper publishing is log-normal because it’s intelligence X ideas X work X … = publications, then a range of 2-3x in each factor would quickly give you the observed skewed range. But the question is where that consistent 2-3x comes from: why couldn’t it be utterly dominated by one step where there’s a range of 1-10,000, say?
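The log-normal compounding story in the footnote can be simulated directly: multiply several independent factors whose own 1-in-1000 range is roughly 3x, and compare the range of the product. All the specifics here (five factors, mean 1.0, SD 0.18) are illustrative assumptions, not estimates from data.

```python
import random

random.seed(0)

def factor() -> float:
    # One 'atomic' capability: normal with mean 1.0 and assumed SD 0.18,
    # so its own 1-in-1000 extremes span roughly 3x. Clamp away the
    # negligible chance of a non-positive draw so the product stays positive.
    return max(random.gauss(1.0, 0.18), 1e-3)

N = 100_000
# Compound capability: intelligence X ideas X work X ... = publications.
products = sorted(factor() * factor() * factor() * factor() * factor() for _ in range(N))
singles = sorted(factor() for _ in range(N))

def one_in_1000_ratio(xs: list[float]) -> float:
    # Ratio of the ~99.9th percentile to the ~0.1th percentile.
    return xs[-(len(xs) // 1000)] / xs[len(xs) // 1000]

print(f"single-factor 1-in-1000 range: {one_in_1000_ratio(singles):.1f}x")
print(f"five-factor product 1-in-1000 range: {one_in_1000_ratio(products):.1f}x")
```

With these assumptions, each factor individually spans only ~3x, but the product spans several times that: multiplying a handful of narrow-range factors is enough to produce the much wider, right-skewed range seen in compound capabilities like publication counts.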
That’s what I was thinking about. Do you still have it on gwern.net? And can you link it please?
Some important implications here:
Eliezer’s spectrum of intelligence is far closer to correct than Dragon god’s, and the claim of a broad spectrum needs to be reframed more narrowly.
This does suggest that AI intelligence could be much greater than real-life humans’, even with limitations. That is, we should expect quite large AI-human capability differentials compared to human-on-human capability differentials.