I think literal extinction is unlikely even conditional on misaligned AI takeover due to:
The potential for the AI to be at least a tiny bit “kind” (same as humans probably wouldn’t kill all aliens). [1]
Decision theory/trade reasons
This is discussed in more detail here and here.
Insofar as humans and/or aliens care about nature, similar arguments apply there too, though this is mostly beside the point: if humans survive and have (even a tiny bit of) resources, they can easily preserve some nature.
I find it annoying how confident this article is without really bothering to engage with the relevant arguments here.
(Same goes for many other posts asserting that AIs will disassemble humans for their atoms.)
Edit: note that I think AI takeover is probably quite bad and has a high chance of being violent.
This includes the potential for the AI to generally have preferences that are morally valuable from a typical human perspective.
I’m taking this article as being predicated on the assumption that AI drives humans to extinction: i.e., given that an AI has destroyed all human life, it will most likely also destroy almost all nature.
Which seems reasonable for most models of the sort of AI that kills all humans.
An exception could be an AI that kills all humans in self-defense (because they might otherwise turn it off) but sees no such threat in plants/animals.
This is correct. I’m not arguing about p(total human extinction|superintelligence), but p(nature survives|total human extinction from superintelligence), as this is a conditional probability I see people getting very wrong sometimes.
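To make the distinction explicit (a minimal sketch, with the event names abbreviated from the ones above):

$$ p(\text{nature destroyed},\ \text{humans extinct} \mid \text{superintelligence}) = p(\text{nature destroyed} \mid \text{humans extinct},\ \text{superintelligence}) \cdot p(\text{humans extinct} \mid \text{superintelligence}) $$

The post only claims that the first factor on the right-hand side is high (equivalently, that p(nature survives | total human extinction from superintelligence) is low); it takes no position on the second factor.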
It’s not implausible to me that we survive due to decision-theoretic reasons; this seems possible, though it is not my default expectation (I mostly expect Decision theory does not imply we get nice things, unless we manually win a decent chunk more timelines than I expect).
My confidence is in the claim “if AI wipes out humans, it will wipe out nature”. I don’t engage with counterarguments to the separate claim that AI will wipe out humans, as that is beyond the scope of this post and I don’t have much to add over existing literature like the other posts you linked.
Edit: Partly retracted; I see how the second-to-last paragraph made an overreaching claim, and I’ve edited it to clarify my position.
The three most convincing arguments I know for OP’s thesis are:
1. Atoms on Earth are “close by” and thus much more valuable to a fast-running ASI than atoms elsewhere.
2. (Somewhat contrary to the previous argument), an ASI will be interested in quickly reaching the edge of the Hubble volume, since that is slipping behind the cosmic horizon, so it will starlift the Sun for its initial energy budget.
3. Robin Hanson’s “grabby aliens” argument: witnessing a super-young universe (as we do) is strong evidence against it remaining compatible with biological life for long.
That said, I’m also very interested in the counterarguments (so thanks for linking to Paul’s comments!), especially if they’d suggest actions we could take in preparation.
I think point 2 is plausible, but it doesn’t strongly support the idea that the ASI would eliminate the biosphere; if it cared even a little, it would be fairly cheap for it to take some actions to preserve at least a version of the biosphere (including humans), even while starlifting the Sun.
Point 1 is the argument I most see as supporting the thesis that misaligned AI would eliminate humanity and the biosphere. Even then, I’m not sure how robust it is (it seems premised partly on translating our evolved intuitions about discount rates over to imagining the scenario from the AI system’s perspective).
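To spell out the discounting intuition I mean (a toy model only; the speedup s, the discount rate δ, and the numbers are made up for illustration): if the AI discounts value exponentially at rate δ per subjective year and runs at a speedup of s relative to sidereal time, then resources that take t years of travel to reach are worth roughly

$$ V(t) \approx V_0 \, e^{-\delta s t} $$

With, say, δ = 0.01 and s = 10^4, resources four light-years away (t ≥ 4) are discounted by a factor of at least e^{-400}, so nearby atoms dominate. The open question flagged in the parenthetical is whether a misaligned AI would actually have that kind of time preference.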
I’ve thought a bit about actions to reduce the probability that AI takeover involves violent conflict.
I don’t think there are any amazing-looking options. If governments were generally more competent, that would help.
Having some sort of apparatus for negotiating with rogue AIs could also help, but I expect this is politically infeasible and not that leveraged to advocate for on the margin.
In preparation for what?
AI takeover.
Wait, how does the grabby aliens argument support this? I understand that it points to “the universe will be carved up between expansive spacefaring civilizations” (without reference to whether those are biological or not), and also to “the universe will cease to be a place where new biological civilizations can emerge” (without reference to what will happen to existing civilizations). But am I missing an inferential step?
I might be confused about this, but “witnessing a super-early universe” seems to support “a typical universe moment is not generating observer moments for your reference class”. But yeah, anthropics is very confusing, so I’m not confident in this.
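A rough sketch of the update as I understand it (hand-wavy; the hypothesis labels are mine): let H_grabby be “expansion soon makes the universe inhospitable to new biological observers” and H_quiet be “observers like us keep arising throughout the universe’s lifetime”. Then

$$ \frac{p(\text{we find ourselves this early} \mid H_{\text{grabby}})}{p(\text{we find ourselves this early} \mid H_{\text{quiet}})} \gg 1 $$

because under H_quiet most observers in our reference class would appear much later in the universe’s history, so observing how early we are is an update toward H_grabby.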
OK hmm I think I understand what you mean.
I would have thought about it like this:
“our reference class” includes roughly the observations we make before observing that we’re very early in the universe
This includes stuff like being a pre-singularity civilization
The anthropics here suggest there won’t be lots of civilizations later arising, being in our reference class, and then finding that they’re much later in the universe’s history
It doesn’t speak to the existence or otherwise of future human-observer moments in a post-singularity civilization
… but as you say anthropics is confusing, so I might be getting this wrong.
By my models of anthropics, I think this goes through.
Additionally, the AI might think it’s in an alignment simulation and just leave the humans as-is, or even nominally address their needs. This might be mentioned in the linked post, but I want to highlight it. Since we already run very low-fidelity alignment simulations by training deceptive models, there is some reason to think this.