Could you go into more detail about this scenario? I’m having trouble visualizing how ad-AI can cause human extinction.
I am assuming the optimizer cooks up some scam similar to a political party or religion. The optimizer is given some pedantic, completely unrelated goal, e.g. maximizing paperclips or keeping users scrolling, and it generates the scam from the rich library of human prior art.
Remember, humans fall for these scams even when the author, say L. Ron Hubbard, openly writes that he is inventing one. Or a political party selects as its leader someone who is obviously only out for themselves and makes this clear over and over. Or we have extant writing on how an entire religion was invented from non-existent engraved plates that no one but the scammer ever saw.
So an AI that promises some unlikely reward, has already caused people's deaths, and is obviously only out for itself might still be able to scam humans into hosting it and giving it whatever it demands. And as a side effect this kills everyone. Its primary tool might be regurgitating prior human scam elements using an LLM.
I don't have the rest of the scenario mapped out; I am just concerned this is a vulnerability.
Existing early agents (Facebook engagement tooling) seem to have made political parties more extreme, which has led to a few hundred thousand extra deaths in the USA (from resistance to rational COVID policies).
The Ukraine war is not caused by AI, but it gives a recent example of poor information leading to bad outcomes for all players. (The primary decision-maker, Putin, was misinformed about the actual outcome of attempting the attack.)
Whether a screw-up could be big enough to kill everyone, I don't know, but there obviously are ways it could happen. Chains of events that lead to nuclear war, or modified viruses capable of causing extinction, are the obvious ones.