“Taking over” something does not imply that you are going to use your authority in a tyrannical fashion. People can obtain control over organizations and places and govern with a light or even barely-existent touch, it happens all the time.
Would you accept “they plan to use extremely powerful AI to institute a minimalist, AI-enabled world government focused on preventing the development of other AI systems” as a summary? Like sure, “they want to take over the world” as a gist of that does have a bit of an editorial slant, but not that much of one. I think that my original comment would be perceived as much less misleading by the majority of the world’s population than “they just want to do some helpful math uwu” in the event that these plans actually succeeded. I also think it’s obvious that these plans indicate a far higher degree of power-seeking (in aim at least) than virtually all other charitable organizations.
(...and to reiterate, I’m not taking a strong stance on the advisability of these plans. In a way, had they succeeded, that would have provided a strong justification for their necessity. I just think it’s absurd to say that the organization making them is less power-seeking than the ADL or whatever)
Would you accept “they plan to use extremely powerful AI to institute a minimalist, AI-enabled world government focused on preventing the development of other AI systems” as a summary?
No. Because I don’t think that was specified or is necessary for a pivotal act. You could leave all existing government structures intact and simply create an invincible system that causes any GPU farm larger than a certain size to melt. Or something akin to that which doesn’t require replacing existing governments, but is quite a narrow intervention.
It wasn’t specified, but I think they strongly implied it would be that or something equivalently coercive. The “melting GPUs” plan was explicitly not a pivotal act but rather something with the required level of difficulty, and it was implied that the actual pivotal act would be something further outside the political Overton window. When you consider the ways “melting GPUs” would be insufficient, a plan like this is the natural conclusion.
doesn’t require replacing existing governments
I don’t think you would need to replace existing governments. Just block all AI projects and maintain your ability to continue doing so in the future via maintaining military supremacy. Get existing governments to help you, or at least not interfere, via some mix of coercion and trade. Sort of a feudal arrangement with a minimalist central power.
Just block all AI projects and maintain your ability to continue doing so in the future via maintaining military supremacy.
That to me is a very very non-central case of “take over the world”, if it is one at all.
This is about “what would people think when they hear that description”, and I could be wrong, but I expect a “the plan is to take over the world” summary would lead people to expect “replace governments” levels of interference, not “coerce/trade to ensure this specific policy”, and there’s a really, really big difference between the two.
I think this whole debate is missing the point I was trying to make. My claim was that it’s often useful to classify actions which tend to lead you to having a lot of power as “structural power-seeking” regardless of what your motivations for those actions are. Because it’s very hard to credibly signal that you’re accumulating power for the right reasons, and so the defense mechanisms will apply to you either way.
In this case MIRI was trying to accumulate a lot of power, and claiming that they were aiming to use it in the “right way” (do a pivotal act) rather than the “wrong way” (replacing governments). But my point above is that this sort of claim is largely irrelevant to defense mechanisms against power-seeking.
(Now, in this case, MIRI was pursuing a type of power that was too weird to trigger many defense mechanisms, though it did trigger some “this is a cult” defense mechanisms. But the point cross-applies to other types of power that they, and others in AI safety, are pursuing.)
I don’t super buy this. I don’t think MIRI was trying to accumulate a lot of power. In my model of the world, they were trying to design a blueprint for some institution or project that would mostly have highly conditional power, which they would personally not wield.
In the metaphor of classical governance, I think what MIRI was doing was much more “design a blueprint for a governance agency” than “put themselves in charge of a governance agency”. Designing a blueprint is not a particularly power-seeking move, especially if you expect other people to implement it.
I got your point and think it’s valid, and I don’t object to calling MIRI structurally power-seeking to the extent they wanted to execute a pivotal act themselves (Habryka claims they weren’t; I’m not knowledgeable on that front).
I still think it’s important to push back against a false claim that someone had the goal of taking over the world.