Would you accept “they plan to use extremely powerful AI to institute a minimalist, AI-enabled world government focused on preventing the development of other AI systems” as a summary?
No, because I don’t think that was specified or is necessary for a pivotal act. You could leave all existing government structures intact and simply create an invincible system that causes any GPU farm larger than a certain size to melt, or something akin to that: a quite narrow intervention that doesn’t require replacing existing governments.
It wasn’t specified, but I think they strongly implied it would be that or something equivalently coercive. The “melting GPUs” plan was explicitly not a pivotal act but rather something with the required level of difficulty, and it was implied that the actual pivotal act would be something further outside the political Overton window. When you consider the ways “melting GPUs” would be insufficient, a plan like this is the natural conclusion.
doesn’t require replacing existing governments
I don’t think you would need to replace existing governments. Just block all AI projects and maintain your ability to continue doing so in the future via maintaining military supremacy. Get existing governments to help you, or at least not interfere, via some mix of coercion and trade. Sort of a feudal arrangement with a minimalist central power.
Just block all AI projects and maintain your ability to continue doing so in the future via maintaining military supremacy.
That to me is a very very non-central case of “take over the world”, if it is one at all.
This is about “what would people think when they hear that description”, and I could be wrong, but I expect the “the plan is to take over the world” summary would lead people to expect a “replace governments” level of interference, not “coerce/trade to ensure this specific policy”, and there’s a really big difference between the two.
I think this whole debate is missing the point I was trying to make. My claim was that it’s often useful to classify actions which tend to lead you to having a lot of power as “structural power-seeking” regardless of what your motivations for those actions are. Because it’s very hard to credibly signal that you’re accumulating power for the right reasons, and so the defense mechanisms will apply to you either way.
In this case MIRI was trying to accumulate a lot of power, and claiming that they were aiming to use it in the “right way” (do a pivotal act) rather than the “wrong way” (replacing governments). But my point above is that this sort of claim is largely irrelevant to defense mechanisms against power-seeking.
(Now, in this case, MIRI was pursuing a type of power that was too weird to trigger many defense mechanisms, though it did trigger some “this is a cult” defense mechanisms. But the point cross-applies to other types of power that they, and others in AI safety, are pursuing.)
I don’t super buy this. I don’t think MIRI was trying to accumulate a lot of power. In my model of the world they were trying to design a blueprint for some institution or project that would mostly have highly conditional power that they would not personally wield.
In the metaphor of classical governance, I think what MIRI was doing was much more “design a blueprint for a governance agency” not “put themselves in charge of a governance agency”. Designing a blueprint is not a particularly power-seeking move, especially if you expect other people to implement it.
I got your point and think it’s valid, and I don’t object to calling MIRI structurally power-seeking to the extent they wanted to execute a pivotal act themselves (Habryka claims they weren’t; I’m not knowledgeable on that front).
I still think it’s important to push back against a false claim that someone had the goal of taking over the world.