There’s a trap here: the more you think about how to prevent bad outcomes from AGI, the more you realize you need to understand current AI’s capabilities and limitations, and to gain that understanding there is no substitute for developing and trying to improve current AI!
A secondary trap is that preventing unaligned AGI will probably require lots of limited, aligned helper AIs, which you have to figure out how to build, again pushing you toward improving current AI.
The strategy of “getting top AGI researchers to stop” runs into a tragedy of the commons: they can be replaced by other researchers with fewer scruples. In principle a tragedy of the commons can be solved, but it’s hard. And even assuming that effort succeeds, how feasible would it be to set up a monitoring regime to prevent covert AGI development?
Top researchers are not easy to replace: without the top 0.1% of researchers, progress would slow by much more than 0.1%.