Planned summary:
A recent paper develops a conceptual model that retrodicts human social learning. The authors assume that asocial learning allows you to adapt to the current environment, while social learning allows you to copy the adaptations that other agents have learned. Both can be increased by growing larger brains, at the cost of increased resource requirements. What conditions lead to very good social learning?
First, we need high transmission fidelity, so that social learning is effective. Second, we need some asocial learning, in order to bootstrap—mimicking doesn’t help if the people you’re mimicking haven’t learned anything in the first place. Third, to incentivize larger brains, the environment needs to be rich enough that additional knowledge is actually useful. Finally, we need low reproductive skew, that is, individuals that are more adapted to the environment should have only a slight advantage over those who are less adapted. (High reproductive skew would select too strongly for high asocial learning.) This predicts pair bonding rather than a polygynous mating structure.
This story cuts against the arguments in Will AI See Sudden Progress? and Takeoff speeds: it seems like evolution “stumbled upon” high asocial and social learning and got a discontinuity in the reproductive fitness of the species. We should potentially also expect discontinuities in AI development.
We can also forecast the future of AI based on this story. Perhaps we need to be watching for the perfect combination of asocial and social learning techniques for AI, and once these components are in place, AI intelligence will develop very quickly and autonomously.
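To make the setup concrete, here is a minimal toy simulation sketch of the kind of dynamics the summary describes. This is not the paper's actual model: the functional forms, parameter names (FIDELITY, RICHNESS, SKEW, BRAIN_COST), and constants below are all illustrative assumptions, and the sketch is not expected to reproduce the paper's quantitative results.

```python
import math
import random

# Toy sketch only (not the paper's model): each agent has an asocial-learning
# ability `a` and a social-learning ability `s`; both add to brain size and
# hence to resource cost. Knowledge comes from the agent's own asocial
# learning plus whatever it can copy (limited by `s` and by transmission
# fidelity) from the most knowledgeable agent of the previous generation.
# Reproductive skew controls how strongly fitness differences translate into
# offspring counts. All constants and functional forms are assumptions.

FIDELITY = 0.9      # transmission fidelity of social learning
RICHNESS = 2.0      # value of each unit of adaptive knowledge (environmental richness)
SKEW = 1.0          # reproductive skew: lower values mean milder selection
BRAIN_COST = 0.5    # resource cost per unit of brain (asocial + social capacity)
POP, GENS = 200, 300

def fitness(knowledge, a, s):
    return RICHNESS * knowledge - BRAIN_COST * (a + s)

pop = [{"a": random.random(), "s": random.random()} for _ in range(POP)]
best_knowledge = 0.0  # most knowledge held by anyone in the previous generation

for _ in range(GENS):
    # Knowledge = own asocial discoveries + copied knowledge (capped by `s`).
    for agent in pop:
        agent["k"] = agent["a"] + min(agent["s"], FIDELITY * best_knowledge)
    best_knowledge = max(agent["k"] for agent in pop)

    # Reproduce in proportion to exp(SKEW * fitness); subtract the max
    # fitness before exponentiating for numerical stability, then mutate.
    fits = [fitness(ag["k"], ag["a"], ag["s"]) for ag in pop]
    top = max(fits)
    weights = [math.exp(SKEW * (f - top)) for f in fits]
    parents = random.choices(pop, weights=weights, k=POP)
    pop = [{"a": max(0.0, p["a"] + random.gauss(0, 0.05)),
            "s": max(0.0, p["s"] + random.gauss(0, 0.05))}
           for p in parents]

mean_a = sum(ag["a"] for ag in pop) / POP
mean_s = sum(ag["s"] for ag in pop) / POP
print(f"mean asocial learning: {mean_a:.2f}, mean social learning: {mean_s:.2f}")
```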
Planned opinion:
As the post notes, it is important to remember that this is one of many plausible accounts for human success, but I find it reasonably compelling. It moves me closer to the camp of “there will likely be discontinuities in AI development”, but not by much.
I’m more interested in what predictions about AI development we can make based on this model. I actually don’t think that this suggests that AI development will need both social and asocial learning: it seems to me that in this model, the need for social learning arises because of the constraints on brain size and the limited lifetimes. Neither of these constraints applies to AI—costs grow linearly with “brain size” (model capacity, maybe also training time) as opposed to superlinearly for human brains, and the AI need not age and die. So, with AI I expect that it would be better to optimize just for asocial learning, since you don’t need to mimic the transmission across lifetimes that was needed for humans.
Awesome, thanks for the super clean summary.
I agree that the model doesn’t show that AI will need both asocial and social learning. Moreover, there is a core difference between the growth of the cost of brain size between humans and AI (sublinear [EDIT: superlinear] vs linear). But in the world where AI dev faces hardware constraints, social learning will be much more useful. So AI dev could involve significant social learning as described in the post.
Moreover, there is a core difference between the growth of the cost of brain size between humans and AI (sublinear vs linear).
Actually, I was imagining that for humans the cost of brain size grows superlinearly. The paper you linked uses a quadratic function, and the authors also tried an exponential and found similar results.
But in the world where AI dev faces hardware constraints, social learning will be much more useful.
Agreed if the AI uses social learning to learn from humans, but that only gets you to human-level AI. If you want to argue for something like fast takeoff to superintelligence, you need to talk about how the AI learns independently of humans, and in that setting social learning won’t be useful given linear costs.
E.g. Suppose that each unit of adaptive knowledge requires one unit of asocial learning. Every unit of learning costs $K, regardless of brain size, so that everything is linear. No matter how much social learning you have, the discovery of N units of knowledge is going to cost $KN, so the best thing you can do is put N units of asocial learning in a single brain/model so that you don’t have to pay any cost for social learning.
In contrast, if N units of asocial learning in a single brain cost $KN^2, then having N units of asocial learning in a single brain/model is very expensive. You can instead have N separate brains each with 1 unit of asocial learning, for a total cost of $KN, and that is enough to discover the N units of knowledge. You can then invest a unit or two of social learning for each brain/model so that they can all accumulate the N units of knowledge, giving a total cost that is still linear in N.
I’m claiming that AI is more like the former while this paper’s model is more like the latter. Higher hardware constraints only change the value of K, which doesn’t affect this analysis.
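A minimal sketch of the cost comparison in the two examples above, assuming the stated functional forms; the SOCIAL_OVERHEAD constant (standing in for the “unit or two” of social learning per brain) and the helper names are illustrative choices, not from the paper:

```python
K = 1.0                 # cost per unit of learning
SOCIAL_OVERHEAD = 2.0   # assumed cost of the "unit or two" of social learning per brain

def brain_cost(asocial_units, superlinear):
    """Cost of one brain/model holding `asocial_units` units of asocial learning."""
    return K * asocial_units ** 2 if superlinear else K * asocial_units

def single_brain_cost(n, superlinear):
    """All N units of asocial learning in one brain; no social learning needed."""
    return brain_cost(n, superlinear)

def distributed_cost(n, superlinear):
    """N brains with 1 unit of asocial learning each, plus enough social
    learning per brain to accumulate what the others discovered."""
    return n * (brain_cost(1, superlinear) + SOCIAL_OVERHEAD)

for n in (10, 100, 1000):
    for superlinear, label in ((False, "linear   "), (True, "quadratic")):
        print(f"N={n:5d}  {label} costs:  "
              f"single brain = {single_brain_cost(n, superlinear):10.0f}  "
              f"N brains + social = {distributed_cost(n, superlinear):10.0f}")
```

Under these assumptions, the linear regime always favors putting everything in a single brain/model, while the quadratic regime favors the distributed population plus social learning for any N above a few units, which is the asymmetry being pointed at here.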