+1 to the distinction between “Regulating AI is possible/impossible” vs “pivotal act framing is harmful/unharmful”.
I’m sympathetic to a view that says something like “yeah, regulating AI is Hard, but it’s also necessary because a unilateral pivotal act would be Bad”. (TBC, I’m not saying I agree with that view, but it’s at least coherent and not obviously incompatible with how the world actually works.) To properly make that case, one has to argue some combination of:
A unilateral pivotal act would be so bad that it’s worth accepting a much higher chance of human extinction in order to avoid it, OR
Aiming for a unilateral pivotal act would not reduce the chance of human extinction much more than aiming for a multilateral pivotal act
I generally expect people opposed to the pivotal act framing to have the latter in mind rather than the former. The obvious argument that aiming for a unilateral pivotal act does reduce the chance of human extinction much more than aiming for a multilateral pivotal act is that it’s much more likely that someone could actually perform a unilateral pivotal act; it is a far easier problem, even after accounting for the problems the OP mentions in Part 1. That, I think, is the main view one would need to argue against in order to make the case for multilateral over unilateral pivotal act as a goal. The OP doesn’t really make that case at all; it argues that aiming for unilateral introduces various challenges, but it doesn’t even attempt to argue that those challenges would be harder than (or even comparably hard to) getting all the major world governments to jointly implement an actually-effective pivotal act.
John, it seems like you’re continuing to make the mistake-according-to-me of analyzing the consequences of a pivotal act without regard for the consequences of the intentions leading up to the act. The act can’t come out of a vacuum, and you can’t build a project compatible with the kind of invasive pivotal acts I’m complaining about without causing a lot of problems leading up to the act, including triggering a lot of fear and panic for other labs and institutions. To summarize from the post title: pivotal act intentions directly have negative consequences for x-safety, and people thinking about the acts alone seem to be ignoring the consequences of the intentions leading up to the act, which is a fallacy.
I see the argument you’re making there. I still think my point stands: the strategically relevant question is not whether unilateral pivotal act intentions will cause problems, the question is whether aiming for a unilateral pivotal act would or would not reduce the chance of human extinction much more than aiming for a multilateral pivotal act. The OP does not actually attempt to compare the two, it just lists some problems with aiming for a unilateral pivotal act.
I do think that aiming for a unilateral act increases the chance of successfully executing the pivotal act by multiple orders of magnitude, even accounting for the part where other players react to the intention, and that completely swamps the other considerations.
Just as a related idea, in my mind, I often do a kind of thinking that HPMOR!Harry would call “Hufflepuff Bones”, where I look for ways a problem is solvable in physical reality at all, before considering ethical and coordination and even much in the way of practical concerns.
it’s much more likely that someone could actually perform a unilateral pivotal act; it is a far easier problem, even after accounting for the problems the OP mentions in Part 1.
What I’ve never understood about the pivotal act plan is exactly what the successful AGI team is supposed to do after melting the GPUs or whatever. Every government on Earth will now consider them their enemy; they will immediately be destroyed unless they can defend themselves militarily, and then countries will simply rebuild the GPU factories and continue on as before (except now in a more combative, disrupted, AI-race-encouraging geopolitical situation). So any pivotal act seems to require, at a minimum, an AI capable of militarily defeating all countries’ militaries. Then, in order to not have society collapse, you probably need to become the government yourself, or take over or persuade existing governments to go along with your agenda. But an AGI that would be capable of doing all this safely seems... not much easier to create than a full-on FAI? It’s not like you could get by with an AI that was freakishly skilled at designing nanomachines but nothing else; you’d need something much more general. But isn’t the whole idea of the pivotal act plan that you don’t need to solve alignment in full generality to execute a pivotal act? For these reasons, executing a unilateral pivotal act (that actually results in an x-risk reduction) does not seem obviously easier than convincing governments to me.
Oh, melting the GPUs would not actually be a pivotal act. There would need to be some way to prevent new GPUs from being built in order for it to be a pivotal act.
Military capability is not strictly necessary; a pivotal act need not necessarily piss off world governments. AGI-driven propaganda, for instance, might avoid that.
Alternatively, an AGI could produce nanomachines which destroy GPUs, are extremely hard to eradicate, but otherwise don’t do much of anything.
(Note that these aren’t intended to be very good/realistic suggestions, they’re just meant to point to different dimensions of the possibility space.)
Oh, melting the GPUs would not actually be a pivotal act
Well yeah, that’s my point. It seems to me that any pivotal act worthy of the name would essentially require the AI team to become an AGI-powered world government, which seems pretty darn difficult to pull off safely. The superpowered-AI-propaganda plan falls under this category. The long-lasting nanomachines idea is cute, but I bet people would just figure out ways to evade the nanomachines’ definition of ‘GPU’.
Note that these aren’t intended to be very good/realistic suggestions, they’re just meant to point to different dimensions of the possibility space
Fair enough... but if the pivotal act plan is workable, there should be some member of that space which actually is good/seems like it has a shot of working out in reality (and which wouldn’t require a full FAI). I’ve never heard any and am having a hard time thinking of one. Now, it could be that MIRI or others think they have a workable plan which they don’t want to share the details of due to infohazard concerns. But as an outside observer, I have to assign a certain amount of probability to that being self-delusion.
Well yeah, that’s my point. It seems to me that any pivotal act worthy of the name would essentially require the AI team to become an AGI-powered world government, which seems pretty darn difficult to pull off safely. The superpowered-AI-propaganda plan falls under this category.
Yeah. I think this sort of thing is why Eliezer thinks we’re doomed – getting humanity to coordinate collectively seems doomed (e.g. see gain-of-function research), and there are no weak pivotal acts that aren’t basically impossible to execute safely.
The nanomachine GPU-melting pivotal act is meant to be a gesture at the difficulty / power level, not an actual working example. The other gestured-example I’ve heard is “upload aligned people who think hard for 1000 subjective years and hopefully figure something out.” I’ve heard someone from MIRI argue that one is also unworkable, but I wasn’t sure on the exact reasons.
The other gestured-example I’ve heard is “upload aligned people who think hard for 1000 subjective years and hopefully figure something out.” I’ve heard someone from MIRI argue that one is also unworkable but wasn’t sure on the exact reasons.
Standard counterargument to that one is “by the time we can do that we’ll already have beyond-human AI capabilities (since running humans is a lower bound on what AI can do), and therefore foom”.
You could have another limited AI design a nanofactory to make ultra-fast computers to run the emulations. I think a more difficult problem is getting a limited AI to do neuroscience well. Actually I think this whole scenario is kind of silly, but given the implausible premise of a single AI lab having a massive tech lead over all others, neuroscience may be the bigger barrier.
Yeah. I think this sort of thing is why Eliezer thinks we’re doomed
Hmm, interesting... but wasn’t he more optimistic a few years ago, when his plan was still “pull off a pivotal act with a limited AI”? I thought the thing that made him update towards doom was the apparent difficulty of safely making even a limited AI, plus shorter timelines.
other gestured-example I’ve heard is “upload aligned people who think hard for 1000 subjective years and hopefully figure something out.”
Ah, that actually seems like it might work. I guess the problem is that an AI that can competently do neuroscience well enough to do this would have to be pretty general. Maybe a more realistic plan along the same lines might be to try using ML to replicate the functional activity of various parts of the human brain and create ‘pseudo-uploads’. Or just try to create an AI with similar architecture and roughly-similar reward function to us, hoping that human values are more generic than they might appear.
It seems relatively plausible that you could use a Limited AGI to build a nanotech system capable of uploading a diverse assortment of (non-brain, or maybe only very small brains) living tissue without damaging them, and that this system would learn how to upload tissue in a general way. Then you could use the system (not the AGI) to upload humans (tested on increasingly complex animals). It would be a relatively inefficient emulation, but it doesn’t seem obviously doomed to me.
Probably too late once hardware is available to do this though.