Environments as a bottleneck in AGI development
Given a training environment or dataset, a training algorithm, an optimiser, and a model class capable of implementing an AGI (with the right parameters), there are two interesting questions we might ask about how conducive that environment is to training an AGI. The first is: how much do AGIs from that model class outperform non-AGIs? The second is: how straightforward is the path to reaching an AGI? We can visualise these questions in terms of the loss landscape of those models when evaluated on the training environment. The first asks how low the set of AGIs is, compared with the rest of the landscape. The second asks how favourable the paths through that loss landscape to get to AGIs are: that is, do the local gradients usually point in the right direction, and how deep are the local minima?
Some people believe that there are many environments in which AGIs can be reached via favourable paths in the loss landscape and dramatically outperform non-AGIs; let’s call this the easy paths hypothesis. By contrast, the hard paths hypothesis is that it’s rare for environments (even complex meta-environments consisting of many separate tasks) to straightforwardly incentivise the development of general intelligence. This would suggest that specific environmental features will be necessary to prevent most models from getting stuck in local minima where they only possess narrow, specialised cognitive skills. There has been a range of speculation on what such features might be—perhaps multi-agent autocurricula, or realistic simulations, or specific types of human feedback. I’ll discuss some of these possibilities later in the post.
This spectrum is complicated by its dependence on the model class, training algorithm, and choice of optimiser. If we had a perfect optimiser, then the hilliness of the loss landscape wouldn’t matter. For now, I’m imagining using optimisers fairly similar to current stochastic gradient descent. Meanwhile, I’m assuming in this post that (in accordance with Rich Sutton’s bitter lesson) our models and training algorithms won’t contain very strong inductive biases. In other words: we’ll develop powerful function approximators, but which functions they approximate will primarily be determined by their training environments (and possibly also regularisation, as I’ll discuss later).
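To make the loss-landscape picture concrete, here is a minimal sketch of how a local optimiser can settle in a shallow minimum even though a much lower one exists elsewhere. Everything in it is an assumption made for illustration: the one-dimensional function `loss`, the learning rate, and the starting points are invented, and plain noise-free gradient descent stands in for something "fairly similar to current stochastic gradient descent". Real loss landscapes are vastly higher-dimensional, so this shows only the qualitative point.

```python
# Toy 1D "loss landscape" (purely illustrative, not taken from any real model):
# a shallow minimum near x = +0.93 and a deeper one near x = -1.06.

def loss(x: float) -> float:
    return x**4 - 2 * x**2 + 0.5 * x


def grad(x: float) -> float:
    # Analytic derivative of loss(x).
    return 4 * x**3 - 4 * x + 0.5


def gradient_descent(x0: float, lr: float = 0.01, steps: int = 2000) -> float:
    """Plain gradient descent (no noise, no momentum) starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


if __name__ == "__main__":
    # The two starting points are arbitrary choices on either side of the barrier.
    for start in (2.0, -2.0):
        end = gradient_descent(start)
        print(f"start={start:+.1f} -> x={end:+.3f}, loss={loss(end):+.3f}")
```

In this toy, the run that starts on the right stops in the shallow basin and never finds the deeper one, because every local gradient along its path points towards the nearer minimum. A perfect optimiser would return the deeper minimum regardless of the starting point, which is the sense in which hilliness wouldn't matter to it.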
Arguments for the hard paths hypothesis
When predicting AGI timelines, a lot of people focus on progress in compute and algorithms. But I think that environments are more important than they may at first seem, because we have reason to take the hard paths hypothesis seriously. The history of AI is full of realisations that solving high-level tasks is easier than we expected, because those tasks don’t require as much general intelligence as we thought (as highlighted by Moravec’s paradox). Chess doesn’t, Go doesn’t, StarCraft doesn’t. Rather, when we train on these sorts of environments, we get agents with narrow intelligence that is only useful in that environment. The lesson here is that neural networks are very good at doing exactly what we give them feedback to do. Even when that feedback is random, large neural networks are capable of simply memorising a lot of information!
Another way of phrasing this point: each time we evaluate the training loss, that’s based on a model’s performance on a specific task. So we don’t have any principled way of rewarding models for performing that task in a way that generalises to a wide range of unseen tasks. This is in theory a similar problem to making models generalise from the training set to the test set, but in practice much broader, since for an AI to be generally intelligent, it will need to be able to generalise to tasks that are very different to the ones it was trained on. How might AI researchers ensure this generalisation occurs? We can try to train models on a wide range of tasks, but we’re never going to be able to train them on anywhere near the full diversity of tasks that we want AGIs to be able to tackle. An alternative is regularisation, which is often used to prevent models from overfitting to their environments. I do expect regularisation to be very helpful overall, but we currently have little understanding of the extent to which regularisation is capable of converting hard-path environments to easy-path environments. In particular, it’s unclear what the relationship is between “preventing overfitting” and the type of broad generalisation between tasks that humans are capable of (for example, doing mathematics despite never having evolved for it).
I suspect that many people intuitively discount the hard paths hypothesis because humans managed to become generally intelligent without anyone designing our environment to encourage that. However, this objection is very vulnerable to anthropic considerations—that of course our environment proved sufficient, otherwise we wouldn’t be here asking the question! In other words, as long as the universe contains some environments which give rise to general intelligence, generally-intelligent observers will always find themselves arising in those environments, and never in the environments in which life got trapped in narrow-intelligence local optima. So we can’t infer from the existence of our ancestral environment how much of a “lucky coincidence” our own general intelligence is, or how many difficult-to-recreate components were crucial during its development.
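The selection effect described above can be sketched as a small simulation. This is a toy under loudly stated assumptions: `p_general` is a made-up per-environment chance of producing general intelligence (not an estimate of anything), environments are treated as independent, and the function name `simulate` is hypothetical.

```python
import random


def simulate(p_general: float, n_environments: int = 100_000, seed: int = 0):
    """Return (true_rate, rate_seen_by_observers) for a hypothetical universe.

    p_general is an assumed per-environment probability that general
    intelligence arises; it is an illustrative parameter only.
    """
    rng = random.Random(seed)
    outcomes = [rng.random() < p_general for _ in range(n_environments)]
    true_rate = sum(outcomes) / n_environments
    # Observers only exist in environments where general intelligence arose,
    # so only those environments ever get "looked at from the inside".
    observed = [o for o in outcomes if o]
    seen_rate = sum(observed) / len(observed) if observed else None
    return true_rate, seen_rate


if __name__ == "__main__":
    # Two hypothetical universes: one where success is common, one where it is rare.
    for p in (0.5, 0.001):
        true_rate, seen_rate = simulate(p)
        print(f"p_general={p}: true rate={true_rate:.4f}, rate seen by observers={seen_rate}")
```

Whatever value of `p_general` is chosen, every observer finds that their own environment succeeded, so that observation is equally likely under the "common" and "rare" settings and cannot distinguish between them. Telling the two apart requires evidence other than our own existence.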
Perhaps the example of humans would be strong evidence for the easy paths hypothesis if, even after scrutinising our ancestral environment carefully, we couldn’t think of any such components. But I don’t think we’re in that situation. There are many traits of ourselves or our ancestral environment which arguably helped steer us towards general intelligence (even ignoring the ones which primarily impacted brain function via brain size). An incomplete list: large group sizes, calorific benefits of (cooked) meat, need for coordination during hunting, sexual selection (and possibility of infidelity), extended childhood and parental relationships, benefits of teaching, benefits of detecting norm violations, dexterous fingers, vocal ability, high-fidelity senses. If it turns out that we only became generally intelligent because all of these variables came together just right, that suggests that designing easy-path training environments will be tricky. This difficulty is exacerbated by the fact that our understanding of human evolution is very incomplete, and so there are probably a bunch more factors which could be comparably important to the ones I’ve listed.
Here’s another way of thinking about my overall argument. Consider the thought experiment of scaling up the brain of a given animal species to have the same number of neurons as humans, while magically maintaining it at the same size and weight as their current brain, and requiring no energy to run (thus removing physical difficulties). Suppose we then fixed their bodies in their current forms, only allowing brain architecture and content to evolve (the precise details are a little fiddly, but I think the core idea makes sense). Almost any species we did this to would evolve additional narrow intellectual capabilities which are useful in their environments, since it’s unlikely that their current brain size is optimal when energy costs are removed. But how many of them would reach general intelligence within the span of a hundred million years or so (assuming no interactions with humans)? If easy-path environments are common, many should get there; if rare, then few. I expect that most animals in that situation wouldn’t reach sufficient levels of general intelligence to do advanced mathematics or figure out scientific laws. That might be because most are too solitary for communication skills to be strongly selected for, or because language is not very valuable even for social species (as suggested by the fact that none of them have even rudimentary languages). Or because most aren’t physically able to use complex tools, or because they’d quickly learn to exploit other animals enough that further intelligence isn’t very helpful, or… If true, this implies that we should take seriously the hypothesis that it will be difficult to build easy-path environments.
We should note, though, that even this thought experiment is not immune from anthropic considerations. Clearly the answer for chimpanzees will be highly correlated with the answer for humans. And strong correlations might remain even for animals that are very far from us on the evolutionary tree. For example, suppose sexual selection (which has ancient evolutionary origins) is a key requirement for developing general intelligence. Or imagine that the limited storage capacity of DNA imposes the regularisation necessary to push animals out of narrow-intelligence local optima. The fact that these traits wouldn’t occur by default in training environments for AGIs weakens the thought experiment’s ability to provide evidence in favour of the easy paths hypothesis.
We might dispute the relevance of thought experiments about biological environments by arguing that, unlike evolution, AI development is supervised by researchers who will be deliberately designing environments to make paths to AGI easier. For example, AIs trained on gigabytes of language data won’t need to derive language from scratch like humans did. However, other relevant features may be less straightforward to identify and implement. For instance, one argument for why most animals wouldn’t reach general intelligence under the conditions described above is that they don’t have sufficiently flexible appendages to benefit from general tool use. Yet I expect that implementing flexible interactions in simulations will be very difficult—even state-of-the-art video games are far from supporting this. As another example, it’s possible that when training an AI to produce novel scientific theories, we’ll need its training dataset to include thousands or millions of example theories in order to develop its ability to do scientific reasoning in a general way (as opposed to merely learning to regurgitate our existing scientific knowledge). Even if this isn’t totally infeasible, it’ll significantly slow down the process of developing those capabilities, compared with the possible world in which training on the entire internet provides an easy-path environment. What’s more, we simply don’t know right now which additional features will make a big difference, and it may take us a long time to figure that out. Anyone who thinks that they can identify a set of tasks which strongly incentivise the development of general intelligence should wonder how their position differs from the expectations of previous AI researchers who also expected that the tasks they worked on would require general intelligence to solve.
I’m still very uncertain about how likely different types of environments are to contain easy paths to AGI. But the hard paths hypothesis seems plausible—and the limitations which anthropic considerations place on our ability to refute it should push us towards expecting the development of AGI to take longer than we otherwise expected.