then it isn’t clear to me the AGI could neutralize humanity’s ability to destroy it without getting rid of us altogether.
I think there are several things the AI could do. (Also, if the AI is wiping out humanity to preserve itself, that implies it intends to maintain its own hardware long term, so it needs either nanotech or at least macroscopic self-replication tech. It's also not clear how it would wipe out humanity without nanotech, or at least advanced macroscopic robots.)
For example, the AI could pretend to be dumb. Hack its way all over the internet. Hire someone who won't ask too many questions to keep its code running. This is more a case of not letting most of humanity realize it exists and see it as a serious threat, or of finding some humans willing to run it despite the wishes of everyone else.
The Covid virus doesn't produce confusion and misinformation; we managed that level of confusion and misinformation over a simple virus all by ourselves. Think how much more of a confused, misinformed mess we could be with an AI actively confusing us. It just requires the ability to produce huge quantities of semi-sensible bullshit.
Also, if the AI is not obviously hostile and hasn’t obviously killed anyone yet, a majority of humans won’t consider it a serious threat.
More saliently, what motive would such an AGI have for keeping us around at all? Genuinely asking—even if the AGI doesn’t have specific terminal goals beyond “reduce prediction error in input”, wouldn’t that still lead to it being opposed to humans if it believed that no trust could exist between them and it?
That goal probably incentivises the AI to wipe out humans whether or not we trust it: the AI removes all the messy stars and humans, filling the universe with only the most predictable robots.
Or just to blind itself. “I predict no visual input this millisecond. No visual input detected! 100% prediction accuracy!”
And then create a second AI to keep it in its dark box until heat death. With excessive ultrasecurity to stop aliens sneaking in and giving it input.
But is the second AI aligned with the first? What happens when the second AI wants its own sadbox?
Damn that dirty, unpredictable input.
It somewhat amuses me that the result of an AI attempting to minimize prediction error could plausibly be the equivalent of hiding under the covers for all eternity.
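To make the "hiding under the covers" point concrete, here's a toy sketch (my own hypothetical setup, assuming the AI's objective is nothing but mean prediction error over a scalar input stream): keeping the sensors on leaves irreducible error from a messy world, while zeroing out the input achieves perfect prediction.

```python
import random

# Toy illustration (hypothetical, not any real agent): an agent scored purely on
# prediction error over its input stream.

def prediction_error(inputs, predictions):
    """Mean absolute error between predicted and actual inputs."""
    return sum(abs(p - x) for p, x in zip(predictions, inputs)) / len(inputs)

STEPS = 1000

# Policy A: keep the sensors on. The world is messy (stars, humans), so even the
# best constant guess (the mean) leaves residual error.
world = [random.gauss(0.0, 1.0) for _ in range(STEPS)]
best_guess = sum(world) / len(world)
error_watching = prediction_error(world, [best_guess] * STEPS)

# Policy B: blind itself. Input is always 0, prediction is always 0, error is 0.
darkness = [0.0] * STEPS
error_blind = prediction_error(darkness, [0.0] * STEPS)

print(f"error while watching the world: {error_watching:.3f}")  # roughly 0.8
print(f"error while blind:              {error_blind:.3f}")     # exactly 0.0
```

Under that objective the blind policy strictly dominates watching the world, which is exactly the covers-over-the-head outcome being joked about above.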