This story increases my probability that AI will lead to a dead rock instead of a superintelligent sphere of computronium, expanding outwards at near the speed of light.
Manipulating humans into taking wild actions will be a much, much easier task than inventing nanotech or building von Neumann probes. I can easily imagine the world ending as too many people go crazy in unprecedented ways as a result of the actions of superhumanly emotionally intelligent AI systems, but not as part of any coordinated plan.
Strong upvote + agree. I’ve been thinking this myself recently. While something like the classic paperclip story seems likely enough to me, I think there’s even more justification for the (less dramatic) idea that AI will drive the world crazy by flailing around in ways that humans find highly appealing.
LLMs aren’t good enough to do any major damage right now, but I don’t think it would take that much more intelligence to get a lot of people addicted or convinced of weird things, even for AI that doesn’t have a “goal” as such. This might not directly cause the end of the world, but it could accelerate it.
The worst part is that AI safety researchers are probably just the kind of people to get addicted to AI faster than everyone else. Like, not only do they tend to be socially awkward and fit everything blaked mentioned, they’re also just really interested in AI.
As much as it pains me to say it, I think it would be better if any AI safety people who want to continue being productive just swore off recreational AI use right now.
Scott Alexander has an interesting little short on human manipulation: https://slatestarcodex.com/2018/10/30/sort-by-controversial/

So far everything I’m seeing, both fiction and anecdotes, is consistent with the notion that humans are relatively easy to model and emotionally exploit. I also agree with CBiddulph’s analysis: while the paperclip/stamp failure mode requires the AI to plan, generating manipulative text doesn’t need a goal at all. If you generate text that is maximally controversial (or maximises some related metric) and disseminate it, that by itself may already do damage.
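To make that concrete, here is a toy sketch of what “maximise a controversy-like metric, with no goal or planning” could look like. Everything in it is a hypothetical placeholder: the keyword-based scorer stands in for a learned engagement/controversy model, and the fragment shuffler stands in for sampling completions from an LLM.

```python
import random

# Hypothetical stand-in for a learned "controversy" scorer
# (e.g. a classifier trained on engagement data). Here it just
# counts a few inflammatory keywords so the sketch runs on its own.
INFLAMMATORY = {"outrage", "betrayal", "they", "lied", "secretly", "everyone"}


def controversy_score(text: str) -> float:
    # Fraction of words that hit the "inflammatory" list.
    words = text.lower().split()
    return sum(w.strip(".,!?") in INFLAMMATORY for w in words) / max(len(words), 1)


def generate_candidates(n: int) -> list[str]:
    # Stand-in for drawing n samples from a language model.
    fragments = ["they", "secretly", "lied", "to", "everyone",
                 "about", "the", "outrage", "betrayal", "again"]
    return [" ".join(random.choices(fragments, k=8)) for _ in range(n)]


def most_controversial(n: int = 1000) -> str:
    # Best-of-n selection: no plan, no world model, no goal
    # representation, just "keep whatever scores highest".
    return max(generate_candidates(n), key=controversy_score)


if __name__ == "__main__":
    print(most_controversial())
```

The point is just that all the selection pressure lives in the metric; nothing here needs to model consequences, which is exactly why this failure mode doesn’t fit the classic agentic-planning story.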
I like it—interesting how much is to do with the specific vulnerabilities of humans, and how humans exploiting other humans’ vulnerabilities was what enabled and exacerbated the situation.
Whilst we’re sharing stories... I’ll shamelessly promote one of my (very) short stories on human manipulation by AI. In this case the AI is being deliberate, at least in achieving its instrumental goals. https://docs.google.com/document/d/1Z1laGUEci9rf_aaDjQKS_IIOAn6D0VtAOZMSqZQlqVM/edit
There’s also a romantic theme ;-)