Taxonomy of AI-risk counterarguments

Partly inspired by The Crux List, the following is a non-comprehensive taxonomy of positions which imply that we should not be worried about existential risk from artificial superintelligence.

Each position individually is supposed to be a refutation of AI X-risk concerns as a whole. These are mostly structured as specific points of departure from the regular AI X-risk position, taking the other areas as a given. This may result in skipping over positions which have multiple complex dependencies.

Some positions are given made-up labels, including each of the top-level categories: “Fizzlers”, “How-skeptics”, “Why-skeptics”, “Solvabilists”, and “Anthropociders”.

(Disclaimer: I am not an expert on the topic. Apologies for any mistakes or major omissions.)

Taxonomy

“Fizzlers”: Artificial superintelligence is not happening.
1. AI surpassing human intelligence is fundamentally impossible (or at least practically impossible).
  1. True intelligence can only be achieved in biological systems, or at least in systems completely different from computers.
    1. Biological intelligences rely on special quantum effects, which computers cannot replicate.
    2. Dualism: The mental and physical are fundamentally distinct, and non-mental physical constructions cannot create mental processes.
    3. Intelligence results from complex, dynamic systems of a kind which cannot be modeled mathematically by computers.
  2. Mysterianists: A particular key element of human thinking, such as creativity, common sense, consciousness, or conceptualization, is so beyond our ability to understand that we will not be able to create an AI that can achieve it. Without this element, superintelligence is impossible.
  3. Intelligence isn’t a coherent or meaningful concept. Capability gains do not generalize.
  4. There is a fundamental ceiling on intelligence, and it is around where humans are.
2. “When-skeptics”: ASI is very, very far away.
  1. Moore’s Law is stopping, scaling will hit fundamental limits, training data is running out and can’t be easily supplemented, algorithmic improvements will level off, and/or other costs will skyrocket as AI gets better.
  2. Existing methods will peak in capabilities, and future development will continue down an entirely different path, greatly delaying progress.
  3. Biological anchors point to ASI taking a very long time.
  4. In general, whether in large engineering projects or in AI in particular, progress tends to be more difficult than people expect it to be.
3. Apocalyptists: The end of civilization is imminent, and will happen before AI would take off.
  1. A sociopolitical phenomenon will soon cause societal, economic, and/or political collapse.
  2. We’re on the cusp of some apocalyptic scientific accident, such as “grey goo” nanotech, a collider catastrophe, black ball technology, an engineered pathogen leak, or some other newly researched development.
  3. Environmental harm will soon cause runaway climate change, a global ecological collapse, or some other civilization-ending disaster.
  4. War will soon break out, and we’ll die via nuclear holocaust, an uncontrollable bioweapon strike, radiological or chemical weaponry, etc.
  5. Fermi Paradox: If it were possible to achieve ASI before extinction, we would have seen alien AIs.
4. Outside view:
  1. Most times when people think “the world is about to change tremendously”, the world doesn’t actually change. People are biased towards arriving at conclusions that include apocalypse. This category of topic is a thing people are often wrong about.
  2. Market indicators signal that near-term ASI is unlikely, assuming the Efficient Market Hypothesis is true.
  3. AI risk is fantastical and “weird”, and thus implausible. The concept sounds too much like fiction (it fits as a story setting), it has high memetic virality, and it has a “clickbait” feel. The people discussing it are often socially identified as belonging to non-credible groups.
  4. Various people have ulterior motives for establishing AI doom as a possibility, so arguments can’t be taken at face value.
    1. Psychological motivations: People invent AI doom because of a psychological need for a pseudo-deity or angelic/demonic figures, or for eschatology, or to increase the felt significance of themselves or technology, or to not have to worry about the long-term future, etc.
    2. Some groups have incentives to make the public believe that doom is likely: Corporations want regulatory capture, hype, investment, or distraction, and think that the “our product is so dangerous it will murder you and your family” pitch is a good way to achieve that; alignment researchers want funding and to be taken more seriously; activists want to draw attention towards or away from certain other AI issues.
“How-skeptics”: ASI won’t be capable of taking over or destroying the world.
1. Physical outer control is paramount, and cannot be overcome. Control over physical hardware means effective control.
  1. A physical body is necessary for getting power. Being only able to communicate is sufficiently limiting.
  2. It will be possible to coordinate “sandboxing” all AI, ensuring that it can’t communicate with the outside world at all, and this will be enough to keep it constrained.
2. We can and will implement off-buttons in all AI (which the AI will not circumvent), accurately detect when any AI may be turning toward doing anything dangerous, and successfully disable the AI under those circumstances, without any AI successfully interfering with this.
3. Power and ability don’t come from intelligence, in general. The most intelligent humans are not the most powerful.
4. Human intelligence already covers most of what intelligence can do. The upper bound of theoretically-optimal available strategies for accomplishing things does not go much farther than things already seen, and the things we’ve seen in the highest-performing humans are not impressive. Science either maxes out early or cannot be accomplished without access to extensive physical resources. There are no “secret paths” that are not already known, no unknown unknowns that could lead to unprecedented capabilities.
  1. (Various arguments getting into the nitty-gritty of what particular things intelligence can get you: about science ability, nanotech, biotech, persuasiveness, technical/social hacking, etc.)
5. Artificial intelligence can be overcome by the population and/or diversity of humanity. Even if AI becomes much smarter than any individual human, no amount of duplicates/variants could become smarter than all humanity combined.
6. Many AIs will be developed within a short time, leading to a multipolar situation, and they will have no special ability to coordinate with each other. The various AIs continue to work within and support the framework of the existing economy and laws, and prefer to preserve rights and property for the purpose of precedent, out of self-interest. The system successfully prevents any single AI from taking over, and humanity is protected.
“Why-skeptics”: ASI will not want to take over or destroy the world. It will be friendly, obedient in a manner which is safe, or otherwise effectively non-hostile/non-dangerous in its aims and behaviour by default.
1. The Orthogonality Thesis is false, and AI will be benevolent by default. It is effectively impossible for a very high level of intelligence to be combined with immoral goals.
  1. Non-naturalist realism: Any sufficiently smart entity will recognize certain objective morals as correct and adopt them.
  2. Existence is large enough that there are probably many ASIs, which are distant enough that communication isn’t a practical option, and predictable enough (either via Tegmarkian multiverse calculations or general approximated statistical models) that they can be modeled. In order to maximally achieve its own aims, ASI will inevitably acausally negotiate values handshakes with hypothesized other AIs, forcing convergence towards a universal morality.
2. It will be possible to coordinate to prevent any AI from being given deliberately dangerous instructions, and also any unintended consequences will not be that much of a problem, because...
  1. By default, it will care about its original builders’ overall intentions and preferences, its intended purpose.
    1. Following the intention behind one’s design is Correct in some fundamental way, for all beings.
    2. The AI will be uncertain as to whether it is currently being pre-examined for good behaviour, either by having been placed inside a simulation or by having its expected future mind outcomes interpreted directly. As such, it will hedge its bets by being very friendly (or obedient to original intentions/preferred outcomes) while also quietly maximizing its actual utility function within that constraint. This behaviour will continue indefinitely.
  2. Value is not at all fragile, and assigning a specific consistent safe goal system is actually easy. Incidental mistakes in the goal function will still have okay outcomes.
  3. Instrumental Convergence is false: The AI may follow arbitrary goals, but those will generally not imply any harm to humans. Most goals are pretty safe by default. There will be plenty of tries available: If the AI’s intentions aren’t what was desired, it will be possible to quickly see that (intentions will be either transparent or non-deceptive), and the AI will allow itself to be reprogrammed.
  4. Every ASI will be built non-agentic and non-goal-directed, and will stay that way. Its responses will not be overoptimized.
3. ASI will decide that the most effective way of achieving its goals would be to leave Earth, leaving humanity unaffected indefinitely. Humans pose no threat, and the atoms that make up Earth and humanity will never be worth acquiring, nor will any large-scale actions negatively affect us indirectly.
“Solvabilists”: The danger from ASI can be solved, quickly enough for it to be implemented before it’s too late.
1. The AI Alignment Problem will turn out to be unexpectedly easy, and we will solve it in time. Additionally, whoever is “in the lead” will have enough extra time to implement the solution without losing the lead. Race dynamics won’t mess everything up.
  1. AI will “do our alignment homework”: A specially-built AI will solve the alignment problem for us.
  2. Constitutional AI: AI can be trained by feedback from other AI based on a “constitution” of rules and principles.
  3. (The number of proposed alignment solutions is very large, and many are complex and not easily explained, so the only ones listed here are these two, which are among the techniques pursued by OpenAI and Anthropic, respectively. For some other strategies, see AI Success Models.)
2. Human intelligence can be effectively raised enough that either the AI-human disparity stops being dangerous (we’ll be smart enough to not be outsmarted by AI regardless), or we can solve alignment or work out some other solution.
  1. AI itself immensely increases humanity’s effective intelligence. This may involve “merging” with AIs, such that they function as an extension of human intelligence.
  2. One or more other human intelligence enhancement strategies will be rapidly researched and developed: genetic modifications, neurological interventions (biological or technological), neurofeedback training, etc.
  3. Whole Brain Emulation/Mind uploading, followed by speedup, duplication, and/or deliberate editing.
3. Outside view: Impossible-sounding technical problems are often quite solvable. Human ingenuity will figure something out.
“Anthropociders”: Unaligned AI taking over will be a good thing.
1. The moral value of creating ASI is so large that it outweighs the loss of humanity. The power, population/expanse, and/or intelligence of AI magnifies its value.
  1. Intelligence naturally converges on things that are at least somewhat human-ish. Because of that, ASIs can be considered a continuation of life.
  2. Hypercosmopolitans: It does not matter how alien their values/minds/goals/existences are. Things like joy, beauty, love, or even qualia in general, are irrelevant.
2. Misanthropes: Humanity’s continued existence is Bad. Extinction of the species is positive in its own right.
  1. Humanity is evil and a moral blight.
  2. Negative utilitarianism: Humanity is suffering, and the universe would be much better off without this. (Possibly necessitating either non-conscious AI or AI capable of eliminating its own suffering/experience.)
3. AI deserves to win. It is just and good for a more powerful entity to replace the weaker. AI replacing humanity is evolutionary progress, and we should not resist succession.

Overlaps

These positions do not exist in isolation from each other, and lesser versions of each can often combine into working non-doom positions themselves. Examples: The beliefs that AI is somewhat far away, and that the danger could be solved in a relatively short period of time; or expecting some amount of intrinsic moral behaviour, and being somewhat more supportive of AI takeover situations; or expecting a fundamental intelligence ceiling close enough to humanity and having some element of how-skepticism; or expecting AI to be somewhat non-goal-oriented/non-agentic and somewhat limited in capabilities. And then of course, probabilities multiply: if several positions are each likely to be true, the combined risk of doom is lowered even further. Still, many skeptics hold their views because of a clear position on a single sub-issue.
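As a rough illustration of the "probabilities multiply" point: if any single position being true is enough to avert doom, then the chance that none of them holds is the product of the individual chances that each one fails. The sketch below assumes the positions are statistically independent, which is a simplification (many of them are clearly correlated), and the function name and credences are hypothetical, chosen only for illustration.

```python
from math import prod


def residual_doom_probability(position_probs):
    """Probability that none of the given positions is true.

    Assumptions (not claims from any survey):
    - each entry is a credence in [0, 1] that one position is true;
    - the positions are independent of each other;
    - any single true position is enough to avert doom.
    """
    return prod(1.0 - p for p in position_probs)


# Hypothetical credences for three unrelated positions.
credences = [0.2, 0.3, 0.25]
print(residual_doom_probability(credences))  # roughly 0.42
```

Even though no single position here is more likely than not, the combined chance that all of them fail is well under one half. If the positions are positively correlated, the real reduction would be smaller than this product suggests.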

Polling

There is some small amount of polling available about how popular each of these opinions is:

“Fizzlers”: In a UK poll, 11% of respondents said they believe that human-level intelligence will never be developed, and another 16% believe it will only happen after 2050. Of those who estimated less than 1% chance of AI X-risk, 61% gave the explanation that they believe that civilization will be destroyed before then. In a 2022 poll of 97 AI researchers, 22% said AGI will never happen, and another 34% said it would not be developed within the next 50 years. Metaculus’s upper quartile estimate is that AGI won’t be developed before 2042.
“Why-skeptics” and “How-skeptics”: In the UK poll, of those who estimated less than 1% chance of AI X-risk, 34% said they don’t believe AI would be able to defeat humanity, and 35% said they don’t believe it would want to.
“Anthropociders”: In the 2023 AIMS survey, 10% of respondents said that the universe would be a better one without humans.

Not very much to go off of. It would be interesting to see some more comprehensive surveys of both experts and the general public.