Are you telling me you’d be okay with releasing an AI that has a 25% chance of killing over a billion people, and a 50% chance of at least killing hundreds of millions? I have to be missing the point here, because this post isn’t doing anything to convince me that AI researchers aren’t Stalin on steroids.
Or are you saying that if one can get to that point, it’s much easier from there to get to the point of having an AI that will cause very few fatalities and is actually fit for practical use?
Rather, I think he means that alignment is such a narrow target, and the space of all possible minds is so vast, that the default outcome is that unaligned AGI becomes unaligned ASI and ends up killing all humans (or even all life) in pursuit of its unaligned objectives. Hitting anywhere close to the alignment target (such that there’s at least 50% chance of “only” one billion people dying) would be a big win by comparison.
Of course, the actual goal is for “things [to] go great in the long run”, not just for us to avoid extinction. Alignment itself is the target, but safety is at least a consolation prize.
So no, I don’t think Nate, Eliezer, or anyone else is okay with releasing an AI that would kill hundreds of millions of people. But AGI is coming, whether we want it or not, and it will not be aligned with human survival (much less human flourishing) by default.
Eliezer tends to think that solving alignment is so much more difficult and so much less researched than raw AGI that doom is almost certain. I’m a bit more optimistic, but I agree that minimizing the probable magnitude of the doom is better than everyone dying.
Feels like Y2K: Electric Boogaloo to me. In any case, if a major catastrophe did come of the first attempt to release an AGI, I think the global response would be to shut it all down, taboo the entire subject, and never let it be raised as a possibility again.
The tricky thing with human politics is that governments will still fund research into very dangerous technology if it has the potential to grant them a decisive advantage on the world stage.
No one wants nuclear war, but everyone wants nukes, even (or especially) after their destructive potential has been demonstrated. No one wants AGI to destroy the world, but everyone will want an AGI that can outthink their enemies, even (or especially) after its power has been demonstrated.
The goal, of course, is to figure out alignment before the first metaphorical (or literal) bomb goes off.
On that note, the main way I could envision AI being really destructive is getting access to a government’s nuclear arsenal. Otherwise, it’s extremely resourceful but still trapped in an electronic medium; the most it could do if it really wanted to cause damage is destroy the power grid (which would destroy it too).
But if they can do that with an AGI capable of ending the acute risk period, then they’ve probably solved most of the alignment problem. Meaning that it should be easy to drive the probability of disaster dramatically lower.
Also this.
You’re underestimating biology.
He’s saying the second.
It’s explicitly the second: