I suspect that “progress toward human-level AI” is extremely non-linear in the outputs, not just the inputs. I think the gaps between “dog” and “human”, and between “human” and “unequivocally superhuman”, are similar in size and narrow on some absolute scale: perhaps those three points sit at 19, 20, and 21 respectively in some arbitrary units.
We have developed some narrow AI capabilities that are superhuman, but in terms of more general intelligence we’re still well short of “dog”. I wouldn’t be very surprised if we were at about 10-15 on that fictional scale, up from maybe 2-3 back in the 80s.
Superficially it doesn’t look like a lot of progress is being made on human-level general intelligence, because our best efforts are in general terms still stupider than the dumbest dogs and that fact hasn’t changed in the last 40 years. We’ve broadened the fraction of tasks where AI can do almost as well as a human or better by maybe a couple of percent, which doesn’t look like much progress.
But that’s exactly what we should expect. Many of the tasks we’re interested in have “sharp” evaluation curves, where they require multiple capabilities and anything less than human performance in any one capability required for the task will lead to a score near zero.
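To make the “sharp” curve concrete, here is a toy sketch. Everything in it is an assumption for illustration: the logistic shape, the `sharpness` value, and multiplying per-capability scores together are just one simple way to model a task that fails if any single required capability falls short. Only the “human = 20” anchor comes from the scale above.

```python
import math

HUMAN_LEVEL = 20.0  # "human" on the arbitrary scale above


def capability_score(level, sharpness=4.0):
    """Assumed logistic curve: near 0 well below human level, near 1 above it."""
    return 1.0 / (1.0 + math.exp(-sharpness * (level - HUMAN_LEVEL)))


def task_score(levels, sharpness=4.0):
    """Toy 'sharp' task: the product of per-capability scores,
    so a single weak capability drags the whole task toward zero."""
    score = 1.0
    for level in levels:
        score *= capability_score(level, sharpness)
    return score


# Far superhuman in two capabilities, but only at 15 in the third.
lopsided = task_score([25.0, 25.0, 15.0])

# Slightly above human level in all three.
balanced = task_score([21.0, 21.0, 21.0])

print(f"lopsided: {lopsided:.2e}, balanced: {balanced:.3f}")
```

In this toy model the lopsided system scores essentially zero despite being superhuman in two of the three capabilities, which is the sense in which measured task performance can look flat while the underlying capabilities keep climbing.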
If this model has any truth to it, by the time we get to 19 on this general intelligence scale we’ll probably still be bemoaning (or laughing at) how dumb our AIs are, even while they have superior capabilities in many more respects and are right on the verge of becoming unequivocally superhuman.
Right now it looks like your 19-21 scale corresponds to something like log(# of parameters) (basically every scaling graph with parameters you’ll see uses this as its x-axis). So it still requires an exponential increase in inputs to drive a linear increase on that scale.
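As a rough illustration of that arithmetic, here is a small sketch. The base of 10 (one scale unit per order of magnitude) is purely an assumption for the example, and the function names are made up; nothing here claims that any particular parameter count maps to “19” or “20”.

```python
import math


def scale_level(num_params, base=10.0):
    """Hypothetical mapping: scale position = log_base(parameter count)."""
    return math.log(num_params) / math.log(base)


def growth_factor(from_level, to_level, base=10.0):
    """How many times more parameters a move along the log scale requires."""
    return base ** (to_level - from_level)


# Each one-unit step costs the same multiplicative factor,
# so equal steps on the scale need exponentially growing inputs.
one_step = growth_factor(19, 20)
two_steps = growth_factor(19, 21)

print(one_step, two_steps)
```

Under that assumed base, moving one unit costs a factor of 10 in parameters and moving two units costs a factor of 100, regardless of where on the scale you start.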
Oh yes, it’s very likely that inputs are also non-linear.
The non-linearity of outputs I was referring to, though, was the likelihood of “sharp” tasks that require near-human capability in multiple aspects to score anything non-negligible in evaluation.