The other issue with pausing training of large foundation models (the type of pause we might actually achieve) is that it could redirect AGI research toward a less alignable type of first AGI.
This is related to, but different from, the direction argument Russel Thor made in that comment thread. You don’t have to think that LLMs are an inefficient path to AGI to worry about changing the direction; LLMs might be an equally or more efficient path that happens to be safer. Indeed, I think that’s the case. The instructability and translucency of LLM cognition make them an ideal base model out of which to build “real” agentic, self-improving AGI that can be aligned to human intentions and whose cognitive processes can be understood and monitored relatively well.
On the other hand, pausing new training runs might redirect a lot of effort into fleshing out current foundation models into more useful cognitive architectures, and those systems seem like our best chance at alignment. Further progress on foundation models might weaken the fairly tight connection between LLM cognition and the language LLMs emit (for instance, if optimization pressure pushes their chains of thought away from faithfully reflecting the underlying reasoning). That would make them less “translucent” and thereby less alignable. So that’s a reason to favor a pause.
Neither of those arguments is commonly raised; I have yet to write a post about them.