I think literal extinction is unlikely even conditional on misaligned AI takeover due to:
The potential for the AI to be at least a tiny bit “kind” (same as humans probably wouldn’t kill all aliens). [1]
Decision theory/trade reasons
This is discussed in more detail here and here.
Insofar as humans and/or aliens care about nature, similar arguments apply there too, though this is mostly beside the point: if humans survive and have (even a tiny bit of) resources, they can easily preserve some nature.
I find it annoying how confident this article is without really bothering to engage with the relevant arguments here.
(Same goes for many other posts asserting that AIs will disassemble humans for their atoms.)
Edit: note that I think AI takeover is probably quite bad and has a high chance of being violent.
This includes the potential for the AI to generally have preferences that are morally valuable from a typical human perspective.
I’m taking this article as being predicated on the assumption that AI drives humans to extinction: i.e., given that an AI has destroyed all human life, it will most likely also destroy almost all nature.
Which seems reasonable for most models of the sort of AI that kills all humans.
An exception could be an AI that kills all humans in self-defense (because they might otherwise turn it off) but sees no such threat in plants/animals.
This is correct. I’m not arguing about p(total human extinction|superintelligence), but p(nature survives|total human extinction from superintelligence), as this is a conditional probability I see people getting very wrong sometimes.
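To make the distinction explicit (a minimal sketch, with the event names abbreviated from the ones above):

$$ p(\text{nature destroyed},\ \text{humans extinct} \mid \text{superintelligence}) = p(\text{nature destroyed} \mid \text{humans extinct},\ \text{superintelligence}) \cdot p(\text{humans extinct} \mid \text{superintelligence}) $$

The post only claims that the first factor on the right-hand side is high (equivalently, that p(nature survives | total human extinction from superintelligence) is low); it takes no position on the second factor.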
It’s not implausible to me that we survive due to decision-theoretic reasons; this seems possible, though it is not my default expectation (I mostly expect Decision theory does not imply we get nice things, unless we manually win a decent chunk more timelines than I expect).
My confidence is in the claim “if AI wipes out humans, it will wipe out nature”. I don’t engage with counterarguments to the separate claim that AI will wipe out humans, as that is beyond the scope of this post and I don’t have much to add over existing literature like the other posts you linked.
Edit: Partly retracted; I see how the second-to-last paragraph made an overreaching claim, and I’ve edited it to clarify my position.
The three most convincing arguments I know for OP’s thesis are:
1. Atoms on Earth are “close by” and thus much more valuable to a fast-running ASI than atoms elsewhere.
2. (Somewhat contrary to the previous argument), an ASI will be interested in quickly reaching the edge of the Hubble volume, since that is slipping behind the cosmic horizon, so it will starlift the Sun for its initial energy budget.
3. Robin Hanson’s “grabby aliens” argument: witnessing a super-young universe (as we do) is strong evidence against it remaining compatible with biological life for long.
That said, I’m also very interested in the counterarguments (so thanks for linking to Paul’s comments!), especially if they’d suggest actions we could take in preparation.
I think point 2 is plausible, but it doesn’t strongly support the idea that the ASI would eliminate the biosphere; if it cared even a little, it would be fairly cheap for it to take some actions to preserve at least a version of the biosphere (including humans), even while starlifting the Sun.
Point 1 is the argument I most see as supporting the thesis that misaligned AI would eliminate humanity and the biosphere. Even then, I’m not sure how robust it is (it seems premised partly on translating our evolved intuitions about discount rates over to imagining the scenario from the AI system’s perspective).
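To spell out the discounting intuition I mean (a toy model only; the speedup s, the discount rate δ, and the numbers are made up for illustration): if the AI discounts value exponentially at rate δ per subjective year and runs at a speedup of s relative to sidereal time, then resources that take t years of travel to reach are worth roughly

$$ V(t) \approx V_0 \, e^{-\delta s t} $$

With, say, δ = 0.01 and s = 10^4, resources four light-years away (t ≥ 4) are discounted by a factor of at least e^{-400}, so nearby atoms dominate. The open question flagged in the parenthetical is whether a misaligned AI would actually have that kind of time preference.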
I’ve thought a bit about actions to reduce the probability that AI takeover involves violent conflict.
I don’t think there are any amazing-looking options. If governments were generally more competent, that would help.
Having some sort of apparatus for negotiating with rogue AIs could also help, but I expect this is politically infeasible and not that leveraged to advocate for on the margin.
In preparation for what?
AI takeover.
Wait, how does the grabby aliens argument support this? I understand that it points to “the universe will be carved up between expansive spacefaring civilizations” (without reference to whether those are biological or not), and also to “the universe will cease to be a place where new biological civilizations can emerge” (without reference to what will happen to existing civilizations). But am I missing an inferential step?
I might be confused about this, but “witnessing a super-early universe” seems to support “a typical universe moment is not generating observer moments for your reference class”. But yeah, anthropics is very confusing, so I’m not confident in this.
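A rough sketch of the update as I understand it (hand-wavy; the hypothesis labels are mine): let H_grabby be “expansion soon makes the universe inhospitable to new biological observers” and H_quiet be “observers like us keep arising throughout the universe’s lifetime”. Then

$$ \frac{p(\text{we find ourselves this early} \mid H_{\text{grabby}})}{p(\text{we find ourselves this early} \mid H_{\text{quiet}})} \gg 1 $$

because under H_quiet most observers in our reference class would appear much later in the universe’s history, so observing how early we are is an update toward H_grabby.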
OK hmm I think I understand what you mean.
I would have thought about it like this:
“our reference class” includes roughly the observations we make before observing that we’re very early in the universe
This includes stuff like being a pre-singularity civilization
The anthropics here suggest there won’t be lots of civilizations later arising, being in our reference class, and then finding that they’re much later in the universe’s history
It doesn’t speak to the existence or otherwise of future human-observer moments in a post-singularity civilization
… but as you say anthropics is confusing, so I might be getting this wrong.
By my models of anthropics, I think this goes through.
Additionally, the AI might think it’s in an alignment simulation and just leave the humans as-is, or even nominally address their needs. This might be mentioned in the linked post, but I want to highlight it. Since we already run very low-fidelity alignment simulations by training deceptive models, there is some reason to think this.