I think the “pivotal act” notion was always borderline insanity. First, it requires the kind of superintelligent, nigh-omnipotent and infallible ASI that only results from a properly aligned FOOM. We don’t even know if such a thing is possible (my money is on “no”). Second, just the idea of it is already fundamentally unacceptable.
“What is your lab doing?”
“Building AGI. We’re trying to align it to our interests and make sure it performs a pivotal act first thing when it comes online.”
“Uhm, a pivotal act? What’s that?”
“It means the AGI will do something big and powerful to stop any dangerous AGIs from being created ever again.”
“...wait, something? What will it do?”
“We don’t know, whatever it thinks is best and has the highest chance of working according to the values we are writing into it. Something. Hopefully we might not even notice what. Maybe we will. Might decide to destroy all computers I guess if it looks bad.”
“So you’re saying the AGI will do something, which might be something very disruptive, but you don’t know what that is? How do you even know that the AGI will not just… be wrong, and make a mess, or let a dangerous AGI rise up anyway?”
“Haha, well, it will surely not be as wrong as we would be. Anyway, mostly, we just hope it works out.”
That’s not really a viable policy to hold, and not something anyone (especially any politician) would endorse: just giving the newborn AGI immense power over us, right off the bat, with no guarantee of how it will be used. The pivotal act as an idea belongs, IMO, in a long list of desperate attempts to reconcile the notion of an AGI existing with the notion of humans still surviving and experiencing a utopia, when that seems less and less likely.
This is a pretty bad misunderstanding of the pivotal act concept and definitions.
The whole reason Eliezer talked about pivotal acts is to force people to be clear about how exactly they want to use an early AGI to end the acute risk period. The concept is explicitly meant to contrast with other alignment approaches like OpenAI’s superalignment plan or Paul’s ELK proposals, which end up deferring to the AI on what the right way to actually end further AI risk is.
Thanks Oliver for adding that context, that’s helpful.

Why is it a misunderstanding, though? Eliezer has said multiple times that he doesn’t know what a good pivotal act would be. And we all know that Eliezer does believe in a fast take-off and huge gains for superintelligence, so I wouldn’t say it’s that weird for him to think that an aligned AGI could quickly reach the point where it can just disseminate nanomachines that sabotage your GPUs whenever you try to make another AGI, or any other such thing.
My point about it being insanity, though, was not just that I don’t agree with Eliezer on the credibility of those take-off scenarios (I think, in fact, that the ASI would sadly stay much longer, possibly forever, in the “smart enough to kill us, not smart enough to save us” window), but also about the political feasibility, and even the morality, of such acts. It’s still an incredibly unilateral act of essentially compulsion on all of humanity; for a very limited purpose, true, but still aggressive in its way (and in practice, I don’t think anyone would actually stop at just doing that, if they had that sort of power). I’m looking at this in terms of actual, half-realistic scenarios in which someone, not Yud himself, actually gets to build an AGI and gets to decide what values to put into it, and other people know they’re doing this, and so on and so forth. And those worlds, IMO, don’t just let it happen, because right or wrong, most people and organizations don’t like the idea of someone making that sort of decision for them.
I mean, because it asserts that the same people who advocate for thinking about pivotal acts, and who popularized the pivotal act notion would say anything like “We don’t know, whatever it thinks is best and has the highest chance of working according to the values we are writing into it.”.
This is explicitly not what Nate and Eliezer and some MIRI people are trying to do. The whole point of a minimal pivotal act is to make it so that you don’t have to align your AI all the way to the point where you can just have it go off and do whatever is best according to the values you programmed into it. It’s so that you have as close as possible to a concrete plan of what you want to do with the AI, planning for the world where you didn’t fully solve the AI Alignment problem and can’t just fully defer to the AI.
I think Eliezer has said multiple times that he has a pretty good idea of what a minimal effective pivotal act would be; he just can’t name it out loud because it’s way outside the Overton window, so he keeps referring to “something-like-melting-GPUs-but-obviously-not-that”?
Ok, so he does admit it’s something completely politically unviable, because it’s probably tyrannical or straight-up lesser-evil-but-still-pretty-evil. At which point I’m not even sure that not saying it out loud doesn’t make it sound even more ominous. The point stands: a “pivotal act” can’t possibly be a viable strategy, and in fact its ethical soundness is altogether questionable, unless it’s really just a forced binary choice between that and extinction.
“Outside the Overton window” ≠ “evil”. Like, “let’s defer to prediction markets in major policy choices” was pretty far outside it for most of history, and probably still is today.
As far as I remember, “melting all GPUs” is not an actual pivotal act because it is not minimal: it’s too hard to align an ASI to build nanobots for this and operate safely in the environment. And I think we can conclude that the actual PA should be pretty tame, because, sure, melting all GPUs is scary and major property destruction, but it’s nothing close to “establishing a mind-controlling surveillance dictatorship”.
Another example of a possible PA is the invention of superhuman intelligence enhancement, but it’s still not minimal.
True, but would you really be ashamed of saying “let’s defer to prediction markets in major policy choices” out loud? That might get you some laughs, and it wouldn’t be taken very seriously, but most people wouldn’t be outright outraged.
And I think we can conclude that the actual PA should be pretty tame, because, sure, melting all GPUs is scary and major property destruction, but it’s nothing close to “establishing a mind-controlling surveillance dictatorship”.
True to a point, but it’s still something that people would strongly object to—since you can’t even prove the counterfactual that without it we’d all be dead. In addition, there is a more serious aspect to it, which is military hardware that uses GPUs. Technically, destroying that in other countries is an act of war, or at least sabotage.
Another example of a possible PA is the invention of superhuman intelligence enhancement, but it’s still not minimal.
I doubt that would solve anything. Intelligence does not equal wisdom; some fool would still probably just use it to build AGI faster.
If you can prove that it was you who stealthily melted all GPUs using AI-developed nanotech, it should be pretty obvious that the same AI, without safety measures, could kill everyone.
Scott Alexander once wrote that while it’s probably not wise to build an AI organisation around a pivotal act, if you find yourself in a position where you can perform one, you should, because, assuming you are not a singular genius decades ahead of everyone else in AI development, if you can perform a pivotal act, someone else in AI can kill everyone.
I mean intelligence in a wide sense, including wisdom, security mindset, and self-control. And obviously, if I could build an AI that could provide such enhancement, I would enhance myself to solve the full value-alignment problem, not hand the enhancement to random unchecked fools.
Yes, but that generalized “I can’t let someone else handle this, I’ll do it myself behind their backs” attitude is how we actually do get 100% all offed, no pivotal acts whatsoever. It’s a delusion to think it leaves a measurable, non-infinitesimal window for actually succeeding—it does not. It simply leads to everyone racing, and eventually someone who’s more reckless, and thus faster, “winning”. Or at best, it leads to a pivotal act by someone who then absolutely goes on to abuse their newfound power, because no one can be inherently trusted with that level of control. That’s the better of the two worlds, but still bad.
Not quite.
If you live in a world where you can let others handle this, you can’t be in a position to perform a pivotal act, because others will successfully coordinate around not giving anyone (including you) the unilateral capability to launch an ASI.
And otherwise, if you find yourself in the situation “there is a red button to melt all GPUs”, it means that others have utterly failed to coordinate, and you should pick the least bad world that remains possible.
I don’t think that’s true; can you find an example?

Eliezer has not publicly endorsed a specific concrete pivotal act, AFAIK, but that’s different.

Ah, fair, I might be mixing the two things. But let’s put it this way—if “melt all GPUs” is the pivotal act example Eliezer keeps going back to, and he has a secret one that he knows but doesn’t say out loud, is it because it’s some kind of infohazard that risks failing if spelled out, or because it’s so bad that he knows it’s better if he doesn’t say it?
I don’t disagree. But I do think people dismissing the pivotal act should come up with an alternative plan that they believe is more likely to work, because the problem is still there: “how can we make sure that no one, ever, builds an unaligned superintelligence?” My alternative plan is regulation.
Oh, yes, I agree. Honestly, I find it all bleak. The kind of regulation needed to prevent this sounds like it may be either insufficient or quite stifling. This is like having to deal with nuclear proliferation, but if the laws of nature allowed everyone to make an atomic pipe bomb by using rocks you can pick up from the ground. So it’s either that, or risking it and hoping that somehow it all turns out well—there are possible vulnerabilities in the AI risk arguments, but personally I don’t find them all that compelling, or something I’d be willing to bet my life on, let alone everyone’s. I just think that the AI risk discourse is quite weighed down by the fact that many people just don’t want to let go of the hope of seeing the singularity in their lifetimes, which prevents them from going all the way to the logical conclusion that we just shouldn’t build AGI, and should find ways to prevent it all around, hard as that is.
This is like having to deal with nuclear proliferation, but if the laws of nature allowed everyone to make an atomic pipe bomb by using rocks you can pick up from the ground.
This is hiding a lot of work, and if it’s interpreted as the most extreme statement possible, I think it’s at best maybe true, and possibly simply false.
And even if it is true, it’s not going to be exploited immediately, and there will be lag time that matters.
Also importantly, LLMs probably aren’t going to scale to existential risk quickly unless our world is extremely vulnerable, due to pretty big issues with how they reason, so that adds additional lag time.
A basic disagreement I have with this post and many rationalist worldviews, including your worldview here, dr_s, is that I believe this statement in the post is either simply false, true but with more limitations than rationalists think, or true but taking a lot longer to materialize than people here think, which is important since we can probably regulate things pretty well as long as the threat isn’t too fast in coming:
My personal bet, however, is that offense will unfortunately trump defense.
Ok, so I may have come off as too pessimistic there. Realistically, I don’t think AGI will actually be something you can achieve on your gaming laptop in a few days of training just yet, or any time soon. So maybe my metaphor should have been different, but it’s hard to give the right sense of scale. The Manhattan Project required quite literally all of the industrial might of the US. This is definitely smaller, though perhaps not do-it-in-my-basement smaller. I do generally agree that there are things we can do—and at the very least they’re worth trying! That said, I still think that even the things that work are kind of too restrictive for my tastes, and I’m also worried that, as always happens, they’ll lead to politicians overreaching. My ideal world would be one in which big AI labs get stifled on creating AGI specifically, specialised AI is left untouched, open-source software for lesser applications is left untouched, and maybe we only monitor large-scale GPU hoarding. But I doubt it’d be that simple. So that’s what I find bleak—that we’re forced into a choice between risk of extinction and risk of oppression, whereas we wouldn’t have to be if people didn’t insist on trying to open this specific Pandora’s box.
That’s definitely progress. I think that the best thing AI regulation efforts can do right now is look to the future, and in particular get prepared with draft plans for AI regulation, so that if or when the next crisis hits, we won’t be fumbling for solutions and will instead have good AI regulations back in the running.
Agreed that those drafts are very important. I also think technical research will be required to find out which regulation would actually be sufficient (I think at present we have no idea). I disagree, however, that waiting for a crisis (warning shot) is a good plan. There might not really be one. If there is one, though, I agree that we should at least be ready.
True that we probably shouldn’t wait for a crisis, but one thing that does stand out to me is that the biggest issue wasn’t political will, but rather that AI governance was pretty unprepared for this moment (though they improvised surprisingly effectively).
In this comment, I will be assuming that you intended to talk of “pivotal acts” in the standard (distribution of) sense(s) people use the term — if your comment is better described as using a different definition of “pivotal act”, including when “pivotal act” is used by the people in the dialogue you present, then my present comment applies less.
I think that this is a significant mischaracterization of what most (? or definitely at least a substantial fraction of) pivotal activists mean by “pivotal act” (in particular, I think this is a significant mischaracterization of what Yudkowsky has in mind). (I think the original post also uses the term “pivotal act” in a somewhat non-standard way in a similar direction, but to a much lesser degree.) Specifically, I think it is false that the primary kinds of plans this fraction of people have in mind when talking about pivotal acts involve creating a superintelligent nigh-omnipotent infallible FOOMed properly aligned ASI. Instead, the kind of person I have in mind is very interested in coming up with pivotal acts that do not use a general superintelligence, often looking for pivotal acts that use a narrow superintelligence (for instance, a narrow nanoengineer) (though this is also often considered very difficult by such people (which is one of the reasons they’re often so doomy)). See, for instance, the discussion of pivotal acts in https://www.lesswrong.com/posts/7im8at9PmhbT4JHsW/ngo-and-yudkowsky-on-alignment-difficulty.
I don’t think this revolutionises my argument. First, there’s a lot of talk about possible example pivotal acts, and they’re mostly just not that believable on their own. The typical “melt all GPUs” is obviously incredibly hostile and disruptive, but yes, of course, it’s only an example. The problem is that without an actual outline of what a perfect pivotal act is, you can’t even hope to do it with “just” a narrow superintelligence, because in that case you need to work out the details yourself, and the details are likely horribly complicated.
But the core, fundamental problem with the “pivotal act” notion is that it tries to turn a political problem into a technological one. “Do not build AGIs” is fundamentally a political problem: it’s about restricting human freedom. Now you can either do that voluntarily, by consensus, with some enforcement mechanism for the majority to impose its will on the minority, or you can do it by force, with a minority using overwhelming power to make the majority go along even against their will. That’s it. A pivotal act is just a nice name for the latter. The essence of the notion is “we can’t get everyone on board quickly enough; therefore, we should just build some kind of superweapon that allows us to stop everyone else from building unsafe AGI, as we define it, whether they like it or not”. It’s not a lethal weapon, and you can argue the utilitarian trade-off from your viewpoint is quite good, but it is undeniably a weapon. And therefore it’s just not something that can be politically acceptable, because people don’t like to have weapons pointed at them, not even when the person making the weapon assures them it’s for their own good. If “pivotal act” became the main paradigm, the race dynamics would only intensify, because then everyone would know they only have one shot, and they wouldn’t trust the others either to get it right or to actually limit themselves to just the pivotal act once they’re the only ones with AI power in the world. And if instead the world came together to agree on a pivotal act… well, that’s just regulation first, as described in this post, and then moving on to develop a kind of special nanobot police to enforce that regulation (which would still be a highly controversial action, and if deployed worldwide, essentially an act of war against any country not subscribing to the AI safety treaty or whatever).
I was just claiming that your description of pivotal acts, and of the people who support them, was incorrect in a way that people who think pivotal acts are worth considering would consider very significant, and in a way that significantly reduces the force of your argument as it applies to what people actually mean by pivotal acts — I don’t see anything in your comment responding to that claim. I would prefer the question of whether pivotal acts are a good idea, with this in mind, to be a separate discussion.
Now, in this separate discussion: I agree that executing a pivotal act with just a narrow, safe superintelligence is a difficult problem. That said, all paths to a state of safety from AGI that I can think of seem to contain difficult steps, so I think a more fine-grained analysis of the difficulty of various steps would be needed. I broadly agree with your description of the political character of pivotal acts, but I disagree with what you claim about associated race dynamics — it seems plausible to me that if pivotal acts became the main paradigm, then we’d have a world in which a majority of relevant people are willing to cooperate / do not want to race that much against others in the majority, and it’d mostly be a race between this group and e/acc types. I would also add, though, that the kinds of governance solutions/mechanisms I can think of that are sufficient to (for instance) make it impossible to perform distributed training runs on consumer devices also seem quite authoritarian.
it seems plausible to me that if pivotal acts became the main paradigm, then we’d have a world in which a majority of relevant people are willing to cooperate / do not want to race that much against others in the majority, and it’d mostly be a race between this group and e/acc types
I disagree, I think in many ways the current race already seems motivated by something of the sort—“if I don’t get to it first, they will, and they’re sure to fuck it up”. Though with no apparent planning for pivotal acts in sight (but who knows).
I would also add, though, that the kinds of governance solutions/mechanisms I can think of that are sufficient to (for instance) make it impossible to perform distributed training runs on consumer devices also seem quite authoritarian.
Oh, agreed. It’s a choice between shitty options all around.