I also find it odd that Bio Anchors does not talk much about data requirements, and I’m glad you pointed that out.
Thus, to get timelines, we’d also need to estimate what dataset/environments are necessary for training AGI. But I’m not sure we know what these datasets/environments look like.
I suspect this could be easier to answer than we think. After all, if you consider a typical human, they only have a certain number of skills, and they only have a certain number of experiences. The skills and experiences may be numerous, but they are finite. If we can enumerate and analyze all of them, we may be able to get a lot of insight into what is “necessary for training AGI”.
If I were to try to come up with an estimate, here is one way I might approach it:
1. What are all the tasks that a typical human (from a given background) can do?
   - This could be a very long list, so it might make sense to enumerate the tasks/skills at only a fairly high level at first.
2. For each task, why are humans able to do it? What experiences have humans learned from, such that they are able to do the task? What is the minimal set of experiences, such that if a human was not able to experience and learn from them, they would not be able to do the task?
   - The developmental psychology literature could be very helpful here.
3. For each task that humans can do, what is currently preventing AI systems from learning to do the task?
   - Maybe AI systems aren’t yet being trained with all the experiences that humans rely on for the task.
   - Maybe all the relevant experiences are already available for use in training, but our current model architectures and training paradigms aren’t good enough.
     - Though I suspect that once people know exactly what training data humans require for a skill, it won’t be too hard to come up with a working architecture.
   - Maybe all the relevant experiences are available, and there is an architecture that is highly likely to work, but we just don’t yet have the resources to collect enough data or train a sufficiently high-capacity model.
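
To make the bookkeeping for this kind of analysis a bit more concrete, here is a minimal Python sketch of how the enumeration might be recorded. Everything in it is my own assumption for illustration: the `Task` and `Blocker` names, the `summarize` helper, and especially the entries (`task_a`, `experience_x`, and so on), which are placeholders and not claims about any real skill or AI system.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Blocker(Enum):
    """Hypothetical labels for the three 'maybe' cases above."""
    MISSING_EXPERIENCES = "experiences not yet used in training"
    ARCHITECTURE = "architecture or training paradigm not good enough"
    RESOURCES = "not enough resources for data collection or model capacity"


@dataclass
class Task:
    name: str
    # Minimal set of experiences a human needs in order to learn the task.
    required_experiences: set = field(default_factory=set)
    # None means nothing is recorded as blocking AI systems on this task.
    blocker: Optional[Blocker] = None


def summarize(tasks, available_experiences):
    """Count tasks per blocker and collect experiences missing from training."""
    blockers = Counter(t.blocker for t in tasks if t.blocker is not None)
    missing = set()
    for t in tasks:
        missing |= t.required_experiences - available_experiences
    return blockers, missing


# Placeholder entries only; a real list would come from the enumeration itself.
tasks = [
    Task("task_a", {"experience_x"}, None),
    Task("task_b", {"experience_x", "experience_y"}, Blocker.MISSING_EXPERIENCES),
    Task("task_c", {"experience_x"}, Blocker.RESOURCES),
]
available = {"experience_x"}

blockers, missing = summarize(tasks, available)
print(blockers)
print(missing)
```

The set of missing experiences is a rough stand-in for "what datasets/environments we still need", and the blocker counts indicate which of the three explanations dominates. Neither is an estimate of anything until the task list is filled in with real entries.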