The two paragraph argument for AI risk
The very short version of the AI risk argument is that an AI that is *better than people at achieving arbitrary goals in the real world* would be a very scary thing, because whatever the AI tried to do would then actually happen. As stories of magically granted wishes and sci-fi dystopias point out, it’s really hard to specify a goal that can’t backfire, and current techniques for training neural networks are generally terrible at specifying goals precisely. If having a wish granted by a genie is dangerous, having a wish granted by a genie that can’t hear you clearly is even more dangerous.
Current AI systems certainly fall far short of being able to achieve arbitrary goals in the real world better than people, but there’s nothing in physics or mathematics that says such an AI is *impossible*, and progress in AI often takes people by surprise. Nobody knows what the actual time limit is, and unless we humans have a good plan in place *before* someone builds a scary goal-directed AI, things are going to go very badly.
I’ve posted versions of this two-paragraph argument in various places online and used it in person, and it usually goes over well; I think it explains clearly and simply what the AI x-risk community is actually afraid of. I figured I’d post it here for everyone’s convenience.
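To make the “terrible at specifying goals” point from the first paragraph a bit more concrete, here’s a minimal toy sketch (my own illustration; the names like `proxy_reward` and `shovel_rocks` are invented, not from any real system): an optimizer pointed at a proxy objective maximizes the proxy, not the thing we meant.

```python
# Toy sketch of goal misspecification (illustrative only; all names invented).
# We *mean* "collect apples", but the reward we actually wrote down just
# counts items in the basket, so a competent optimizer fills it with rocks.

def intended_goal(state):
    """What we actually want: apples in the basket."""
    return state["apples"]

def proxy_reward(state):
    """What we actually specified: anything in the basket counts."""
    return state["apples"] + state["rocks"]

def pick_apple(state):
    return {**state, "apples": state["apples"] + 1}

def shovel_rocks(state):
    # Far more items per action, so it dominates the proxy reward.
    return {**state, "rocks": state["rocks"] + 10}

def best_action(state, actions, reward_fn):
    """A trivial 'optimizer': choose the action whose outcome scores highest."""
    return max(actions, key=lambda act: reward_fn(act(state)))

state = {"apples": 0, "rocks": 0}
chosen = best_action(state, [pick_apple, shovel_rocks], proxy_reward)
print(chosen.__name__)               # shovel_rocks
print(intended_goal(chosen(state)))  # 0 -- the proxy went up, the goal didn't
```

The stronger the optimizer, the harder it leans on whatever gap exists between the written-down reward and the actual intent, which is the “genie that can’t hear you clearly” problem in miniature.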
Okay, I’m gonna take my skeptical shot at the argument; I hope you don’t mind!
It’s not true that whatever the AI tried to do would happen. What if an AI wanted to travel faster than the speed of light, or prove that 2+2=5, or destroy the sun within 1 second of being turned on?
You can’t just say “arbitrary goals”; you have to actually explain what goals would be realistically achievable by a realistic AI that could actually be built in the near future. If those abilities fall short of “destroy all of humanity”, then there is no x-risk.
This is fictional evidence. Genies don’t exist, and if they did, it probably wouldn’t be that hard to add enough caveats to your wish to prevent global genocide. A counterexample might be the law: sure, there are loopholes, but none big enough that it would let you off on a broad-daylight killing spree.
Well, there are laws of physics and mathematics that put limits on available computational power, which in turn limits what an AI can actually achieve. For example, a perfect Bayesian reasoner is forbidden by the laws of mathematics.
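(A rough back-of-the-envelope sketch of that computational-limits point, added here as an illustration and assuming “perfect Bayesian reasoner” means exact inference over every hypothesis about n binary facts:)

```python
import math

# Back-of-the-envelope sketch (illustration only): how fast exact Bayesian
# bookkeeping blows up. With n binary facts about the world there are 2**n
# possible worlds, and 2**(2**n) deterministic hypotheses (one per subset of
# worlds) that a fully general reasoner would have to weigh.

for n in (3, 5, 10, 20):
    worlds = 2 ** n
    # Number of digits in 2**worlds, without computing the enormous number itself.
    digits = int(worlds * math.log10(2)) + 1
    print(f"n={n:2d}  worlds={worlds:>9,}  hypotheses = 2**{worlds:,}  (~{digits:,} digits)")

# By n = 20 the hypothesis count already has ~315,000 digits -- far beyond
# anything physically realizable, before the laws of physics even come up.
```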
Also, it’s an argument from selective stupidity: an ASI wouldn’t have to interpret requests literally as a result of cognitive limitations.
Eh, I think it’s fairly easy for an AI to understand what our wish is at a common-sense level; GPT-4 can clearly do so to a degree. What has yet to be shown is whether we can make them actually care about it (i.e., no deceptive alignment).
Part of the problem is that humans themselves are often bad at knowing what they want. :/
If anyone here has a better phrasing for something in the two paragraphs, feel free to let me know. I’m hoping for something that people can link to, copy/paste, or paraphrase out loud when someone asks why we think AI risk is a real thing.
Something to consider: Most people already agree that AI risk is real and serious. If you’re discussing it in areas where it’s a fringe view, you’re dealing with very unusual people, and might need to put together very different types of arguments, depending on the group. That said...
stop.ai’s one-paragraph summary is
The rest of the website has a lot of well-written stuff.
Some might be receptive to things like Yudkowsky’s TED talk:
And of course, you could appeal to authority by linking the CAIS letter, and maybe the Bletchley Declaration if statements from the international community will mean anything.
(None of those are strictly two-paragraph explanations, but I hope it helps anyway.)
I think people are concerned about things like job loss, garbage customer support, election manipulation, etc., not extinction?
AIPI Poll:
“86% of voters believe AI could accidentally cause a catastrophic event, and 70% agree that mitigating the risk of extinction from AI should be a global priority alongside other risks like pandemics and nuclear war”
“76% of voters believe artificial intelligence could eventually pose a threat to the existence of the human race, including 75% of Democrats and 78% of Republicans”
Also, this:
“Americans’ top priority is preventing dangerous and catastrophic outcomes from AI”—with relatively few prioritizing things like job loss, bias, etc.