I’m partly responding to things people have said in conversation with me. For example, the thing Longs says that is a direct quote from one of my friends commenting on an early draft! I’ve been hearing things like this pretty often from a bunch of different people.
I’m also partly responding to Ajeya Cotra’s epic timelines report. It’s IMO the best piece of work on the topic there is, and it’s also the thing that bigshot AI safety people (like OpenPhil, Paul, Rohin, etc.) seem to take most seriously. I think it’s right about most things, but one major disagreement I have with it is that it seems to put too much probability mass on “Lots of special sauce needed” hypotheses. Shorty’s position—the “not very much special sauce” position—applied to AI seems to be that we should anchor on the Human Lifetime anchor. If you think there’s probably a little special sauce but that it can be compensated for via e.g. longer training times and bigger NNs, then that’s something like the Short-Horizon NN hypothesis. I consider Genome Anchor, Medium and Long-Horizon NN Anchor, and of course Evolution Anchor to be “lots of special sauce needed” views. In particular, all of these views involve, according to Ajeya, “Learning to Learn.” I’ll quote her in full:
We may need long horizons for meta-learning or other abilities that evolution selected for
Training a model with SGD to solve a task generally requires vastly more data and experience than a human would require to learn to do the same thing. For example, esports players generally train for a few years to reach professional level play at games like StarCraft and DOTA; on the other hand, AlphaStar was trained on 400,000 subjective years of StarCraft play, and the OpenAI Five DOTA model was trained on 7000 subjective years of DOTA. GPT-3 was trained on 300 billion tokens, which amounts to about 3000 subjective years of reading given typical human reading speeds; despite having seen many times more information than a human about almost any given topic, it is much less useful than a human for virtually all language-based jobs (programming, policymaking, research, etc).
I think that for a single model to have a transformative impact on its own, it would likely need to be able to learn new skills and concepts about as efficiently as a human, and much more efficiently than hand-written ML algorithms like SGD. For a model trained in 2020 to accelerate the prevailing rate of growth by 10x (causing the economy to double by ~2024), it seems like it would have to have capabilities broadly along the lines of one of the following:
Automate a wide swathe of jobs such that large parts of the economy can ~immediately transition to a rate of growth closer to the faster serial thinking speeds of AI workers, or
Speed up R&D progress for other potentially transformative technologies (e.g. atomically precise manufacturing, whole brain emulation, highly efficient space colonization, or the strong version of AGI itself) by much more than ten-fold, such that once the transformative model is trained, the relevant downstream technology can be developed and deployed in only a couple of additional years in expectation, and then that technology could raise the growth rate by ten-fold. For AI capable of speeding up R&D like this, I picture something like an “automated scientist/engineer” that can do the hardest parts of science and engineering work, including quickly learning about and incorporating novel ideas.
Both of these seem to require efficient learning in novel domains which would not have been represented fully in the training dataset. In the first case, the model would need to be a relatively close substitute for an arbitrary human and would therefore probably need to learn new skills on the job as efficiently as a human could. In the second case, the model would likely need to efficiently learn about how a complex research domain works with very little human assistance (as human researchers would not be able to keep up with the necessary pace).
Humans may learn more efficiently than SGD because we are able to use sophisticated heuristics and/or logical reasoning to determine how to update from a particular piece of information in a fine-grained way, whereas SGD simply executes a “one-size-fits-all” gradient update step for each data point. Given that SGD has been used for decades without improving dramatically in sample-efficiency, I think it is relatively unlikely that researchers will be able to hand-design a learning algorithm which is in the range of human-level sample efficiency.
Instead, I would guess that a transformative ML problem would involve meta-learning (that is, using a hand-written optimization algorithm such as SGD to find a model which itself uses its own internal process for learning new skills, a process which may be much more complex and sophisticated than the original hand-written algorithm).
My best guess is that human ability to learn new skills quickly was optimized by natural selection over many generations. Many smaller animals seem capable of learning new skills that were not directly found in their ancestral environment, e.g. bees, mice, octopi, squirrels, crows, dogs, chimps, etc.
The larger animals in particular seem to be able to learn complex new tasks over long periods of subjective time: for example, dogs are trained over a period of months to perform many relatively complex functions such as guiding the blind, herding sheep, assisting with a hunt, unearthing drugs or bombs, and so on. My understanding is that animals trained to perform in a circus also learn complex behaviors over a period of weeks or months. Larger animals seem to exhibit a degree of logical reasoning as well (e.g. the crow in the linked video above), which seems to help speed up their learning, although I’m less confident in this.
This makes me believe it’s likely that our brain’s architecture, our motivation and attention mechanisms, the course of brain development over infancy and childhood, synaptic plasticity mechanisms, and so on were optimized over hundreds of millions of generations for the ability to learn and perhaps reason effectively.
The average generation length was likely several months or years over the period of evolutionary history that seems like it could have been devoted to optimizing for animals which learn efficiently. I consider this a prima facie reason to believe that the effective horizon length for meta-learning—and possibly for training other cognitive abilities which were also selected over evolutionary time—may be in the range of multiple subjective months or years. It could be much lower in reality for various reasons (see below), but anchoring to generation times seems like a “naive” default.
Here I am not saying we should expect that training a transformative model would take as much computation as natural selection (that view is represented by the Evolution Anchor hypothesis which I place substantially less weight on than the Neural Network hypotheses). I am instead saying:
A transformative model would likely need to be able to learn new skills and concepts as efficiently as a human could.
Hand-written optimization algorithms such as SGD are currently much less efficient than human learning is, and don’t seem to be on track to improve dramatically over a short period of time, so training a model that can learn new things as efficiently as a human is likely to require meta-learning.
It seems likely that evolution selected humans over many generations to have good heuristics for learning efficiently. So naively, we should expect that it could take an amount of subjective time comparable to the average generation length in our evolutionary history to be able to tell which of two similar models is more efficient at learning new skills (or better at some other cognitive trait that evolution selected for over generations).
My understanding is that meta-learning has had only limited success so far, and there have not yet been strong demonstrations of meta-learning behaviors which would take a human multiple subjective minutes to learn how to do, such as playing a new video game. Under this hypothesis (assuming that training data is not a bottleneck), the implicit explanation for the limited success of meta-learning would be some combination of a) our models have not been large enough, and b) our horizons have not been long enough.
This seems like a plausible explanation to me. Let’s estimate the cost of training a model to learn how to play a new video game as quickly as a human can:
Effective horizon length: Learning to play an unfamiliar video game well takes a typical human multiple hours of play; I will assume the effective horizon length for the meta-learning problem is one subjective hour.
Model FLOP / subj sec and parameter count: Even if our ML architectures are just as good as nature’s brain architectures, it seems plausible that models much smaller than the size of a mouse brain aren’t capable of learning to learn complex new behaviors at all—my understanding is that we have some solid evidence of mice learning complex behaviors, and more ambiguous evidence about smaller animals. According to Wikipedia, a mouse has about ~1e12 synapses in its brain, implying that its brain runs on ~1e12 FLOP/s. I will assume we need a model larger than the equivalent of a bee but smaller than the equivalent of a mouse (say at least ~3e9 parameters and 1e11 FLOP / subj sec) to perform well on the “learning to learn new video games” ML problem.
If the scaling behavior follows the estimate generated in Part 2, the amount of computation required to train a model that could quickly master a new video game should be (3600 subj sec) * (1e11 FLOP / subj sec) * (1700 * (3e9)^0.8 data points) ≈ 2e25 FLOP. At ~1e17 FLOP per dollar, that would cost $200 million, which makes it unsurprising this hasn’t been successfully demonstrated yet, given that it is not particularly valuable.
Note that while meta-learning seems to me like the single most likely way that a transformative ML problem could turn out to have a long horizon, there may be other critical cognitive traits or abilities that were optimized by natural selection which may have an effective horizon length of several subjective months or longer.
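As a quick sanity check on the arithmetic in the quoted video-game estimate, here’s a minimal sketch in Python. The formula and constants (the samples ≈ 1700·P^0.8 scaling, the 3e9 parameters, 1e11 FLOP per subjective second, and 1e17 FLOP per dollar) are just my reading of her assumptions above, so treat it as illustrative rather than a faithful reproduction of her model:

```python
# Rough sanity check of the quoted "learning to learn a new video game" estimate.
# Formula as I read it: training FLOP ~ (effective horizon in subj sec)
#   * (model FLOP per subj sec) * (number of training samples),
# with samples ~ 1700 * parameter_count ** 0.8 (the Part 2 scaling she references).

horizon_subj_sec = 3600      # one subjective hour per sample (her assumption)
flop_per_subj_sec = 1e11     # between bee- and mouse-brain scale (her assumption)
param_count = 3e9            # ditto
flop_per_dollar = 1e17       # her assumed hardware price-performance

samples = 1700 * param_count ** 0.8               # ~6.5e10 samples
training_flop = horizon_subj_sec * flop_per_subj_sec * samples
cost_dollars = training_flop / flop_per_dollar

print(f"training FLOP ~ {training_flop:.1e}")      # ~2.3e25, i.e. her rounded ~2e25
print(f"cost ~ ${cost_dollars / 1e6:.0f} million") # ~$230M, i.e. her rounded ~$200M
```

The point is just that, under these assumptions, the headline numbers (~2e25 FLOP, a couple hundred million dollars) do hang together.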
I interpret her as making the non-bogus version of the argument from efficiency here. However (and I worry that I’m being uncharitable?), I also suspect that the bogus version of the argument is sneaking in a little bit: she keeps talking about how evolution took millions of generations to do stuff, as if that’s relevant… I certainly think that even if she isn’t falling for the bogus arguments herself, it’s easy for people to fall for them, and this would make her conclusions seem much more reasonable than they are.
In particular, she assigns only 5% weight to the human lifetime anchor—the hypothesis that Shorty is promoting—and only 20% weight to the short-horizon NN anchor, which I think of as the “There’s some special sauce but we can find it with a few OOMs of searching and scaling up key variables” hypothesis. She assigns 75% of her weight to the various “There’s a lot of special sauce needed, we’re going to have to do a TON of search and/or have some brilliant new insights” hypotheses. In other words, the “Longs is right” hypotheses.
I think this is lopsided; much more weight should be on the lower-special-sauce anchors/hypotheses. Why? Well, why not? We haven’t actually been presented with strong reason to think Longs is right about AI. There are a bunch of bogus arguments which many people find seductive, but when you cut them away, we are left with… only the non-bogus argument Ajeya made / I sketched in Part 3. And that’s not a super convincing argument to me, in part because it feels like someone could have made a very similar argument in 1900 about airplane control or about understanding the principles of efficient flight. Meanwhile, we have the example of birds and planes as precedent for Shorty being right sometimes…
I probably should have put this in the main post. Maybe I’ll make it its own post someday. I’d be interested to hear what Rohin and Ajeya think.
Thanks, and I look forward to seeing your reply!