Humans are capable of solving conceptually difficult problems, so they do. An easier path might be possible that doesn’t depend on such capabilities and doesn’t stall for their lack, just as evolution doesn’t stall for lack of any mind at all. If there is more potential for making models smarter alien tigers by scaling RL in o1-like post-training, and the scaling proceeds to 1 gigawatt and then 35 gigawatt training systems, it might well be sufficient to get an engineer AI that can improve such systems further, at 400x and then 10,000x the compute of GPT-4.
Before o1, there was a significant gap, the mysterious absence of System 2 capabilities, with only a vague expectation that they might emerge or become easier to elicit from scaled up base models. This uncertainty no longer gates the engineering capabilities of AIs. I’m still unsure that scaling directly can make AIs capable of novel conceptual thought, but AIs becoming able to experimentally iterate on AI designs seems likely, and that in turn seems sufficient to eventually mutate these designs towards the remaining missing capabilities.
(It’s useful to frame most ideas as exploratory engineering rather than forecasting. The question of whether something can happen, or can be done, doesn’t need to be contextualized within the question of whether it will happen or will be done. Physical experiments are done under highly contrived conditions, and similarly we can conduct thought experiments or conceptual arguments under fantastical or even physically impossible conditions. Thus I think Carl Shulman’s human level AGI world is a valid exploration of the future of AI, even though I don’t believe that most of what he describes happens in actuality before superintelligence changes the premise. It serves as a strong argument for industrial and economic growth driven by AGI, even though it almost entirely consists of describing events that can’t possibly happen.)