This is another example of how matching a specialized human reasoning skill seems routinely feasible with search guided by 100M-scale networks trained on a task that a human would spend years mastering. These tasks look narrow, but it’s plausible that the full breadth of human activity can be covered by a reasonable number of such areas of specialization. What’s currently missing is a way to automate the formulation and training of systems specialized in any given skill.
The often-touted, surprisingly good sample efficiency of humans might just mean that, when training is set up correctly, it’s sufficient to train models whose size is comparable to the amount of external text data a human might need to master a skill, rather than models comparable in size to a human brain. This doesn’t currently work for training systems that autonomously do research and produce new AI experiments and papers, and in practice the technology might take a very different route. But once it does work, surpassing human performance might not require millions of GPUs, even for training.