I think part of the “calculus” being run by the AI safety folks is as follows:
There are certainly both some dumb ways humanity could die (e.g., AI-enabled bioweapon terrorism that could easily have been prevented by some RLHF plus basic screening checks at protein synthesis companies) and some very tricky, advanced ways (e.g., AI takeover by a superintelligence with a very subtle form of misalignment, using lots of brilliant deception).
It seems like the dumber ways are generally more obvious and visible to other people (like military generals or the median voter), whereas those same people are skeptical of the trickier paths (e.g., not taking the prospect of agentic, superintelligent AI seriously; assuming alignment will continue to be easy even as AI gets smarter; not believing you could ever use AI to do useful AI research).
The trickier paths also seem to demand a longer head start and more careful advance thinking.
Therefore, I (one of the rare believers in things like “deceptive misalignment is likely” or “superintelligence is possible”) should work on the trickier paths; others (like the US military, or other government agencies, or whatever) will eventually recognize and patch the dumber paths.