The fact that progress on existing environments (Go, ALE-57, etc.) isn't bottlenecked by the environments themselves doesn't seem like particularly useful evidence here. The question is whether we could be making much more progress towards AGI with environments that were more conducive to developing it. That we're running out of "headline" challenges along the lines of Go and Starcraft is one reason to think that better environments would make a big difference. To be clear, though, the main focus of my post is on the coming decades; the claim that environments are a bottleneck right now does seem much weaker.
More concretely, is it possible to construct some dataset on which our current methods would get significantly closer to AGI than they are today? I think that’s plausible—e.g. perhaps we could take the linguistic corpus that GPT-3 was trained on, and carefully annotate what counts as good reasoning and what doesn’t. (In some ways this is what reward modelling is trying to do—but that focuses more on alignment than capabilities.)
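To make the annotation idea slightly more concrete, here's a minimal sketch of what such a pipeline might look like: corpus passages labelled for reasoning quality, plus a scorer trained on those labels (analogous to a reward model, but aimed at capabilities rather than alignment). Everything in it is illustrative. The `AnnotatedPassage` schema, the toy corpus, and the hashed bag-of-words featurizer are assumptions I've made up for the sketch, not an existing dataset format or method.

```python
# Hypothetical sketch of "annotate the corpus for reasoning quality,
# then train a scorer on the annotations". All names are illustrative.

import torch
import torch.nn as nn
from dataclasses import dataclass

@dataclass
class AnnotatedPassage:
    text: str             # a passage from the pretraining corpus
    good_reasoning: bool  # human judgement: does this exemplify good reasoning?

# Toy stand-in for an annotated slice of a GPT-3-style corpus.
corpus = [
    AnnotatedPassage("All men are mortal; Socrates is a man; so Socrates is mortal.", True),
    AnnotatedPassage("It rained after I washed my car, so washing cars causes rain.", False),
]

# Hashed bag-of-words features, just to keep the sketch self-contained.
# (Python's string hash is salted per process; fine for a toy example.)
VOCAB = 1024
def featurize(text: str) -> torch.Tensor:
    v = torch.zeros(VOCAB)
    for tok in text.lower().split():
        v[hash(tok) % VOCAB] += 1.0
    return v

# A linear "reasoning-quality" scorer, trained to predict the annotations.
# In reality this would be a head on a large pretrained model, not a probe.
model = nn.Linear(VOCAB, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

X = torch.stack([featurize(p.text) for p in corpus])
y = torch.tensor([[1.0 if p.good_reasoning else 0.0] for p in corpus])

for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```

Presumably the interesting uses of such a scorer would be on the capabilities side: filtering or reweighting the pretraining corpus towards well-reasoned text, rather than (as in standard reward modelling) serving as an RL signal for alignment.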
Or, to put it another way: suppose we gave the field of deep learning 10,000x current compute and algorithms that are 10 years ahead of today. Would people know what to apply them to in order to get much closer to AGI? If not, that also suggests that environments will be a bottleneck unless someone starts focusing on them within the next decade.