The most obvious-to-me flaw with the plan of “hang out in the slightly superhuman range for a few decades and try to slowly get better alignment work done, possibly with the help of the AI” is that it requires no one ever turning an AI up even a little bit.
That level of coordination isn’t completely infeasible, but it doesn’t seem remotely reliable.
100%, if I thought we had other options I’d obviously choose them.
The only reason this might be even hypothetically possible is self-interest: if we can create really broad social consensus about the difficulty of alignment, restraint becomes the selfish choice. No one is trying to kill themselves.