It’s much more the same than a lot of prosaic safety though, right?
Let me put it this way: If an AI can’t achieve catastrophe on that order of magnitude, it also probably cannot do something truly existential.
One of the issues this runs into is if a misaligned AI is playing possum, and so doesn’t attempt lesser catastrophes until it can pull off a true takeover. I nonetheless think this framing points generally at the right type of work (understood that others may disagree, of course).
Not confident, but I think that “AIs that cause your civilization problems” and “AIs that overthrow your civilization” may be qualitatively different kinds of AIs. Regardless, existential threats are the most important thing here, and we already have a short term (‘x-risk’) that refers to that work.
And anyway, I think the ‘catastrophic’ term is already being used to obfuscate: Anthropic uses it exclusively on their website and in their papers, literally never talking about extinction or disempowerment[1], and we shouldn’t let them get away with that by also adopting their worse terminology.
(And they use the term ‘existential’ 3 times in oblique ways that barely count.)