Winners-take-how-much?
The treatments I have seen to date of the “winner-takes-all” dynamic in AI development focus on the risk-taking engendered by this dynamic: the more AI researchers have to gain from being early movers, the stronger their incentive to cut corners on safety measures. There seems to be little concern about the result itself, in which a small number of winners has successfully emerged (without, that is, blowing everything up in the effort). “Winner-takes-all,” in other words, is intended only as a term of art from conventional political economy, where “all” refers to the spoils of an otherwise limited contest between willing contestants. The runners-up may lose their investments, but not their lives.
To put it another way, we could describe the prevailing model as having three contests: 1) between all humans and a hypothetical uncontrolled AI; 2) between firms racing to establish monopolistic advantage in the production and sale of controlled AI; 3) between humans in general and these firms over the unprecedented risks the latter impose on the former, intermixed with a conventional struggle between the former as consumers and the latter as would-be monopoly producers. Of the three contests, the risk of unconstrained rivalrous behavior—that is, a fight to the death—is considered to apply only to the first: the contest between the hypothetical uncontrolled AI and all humans, collectively.
What seems to me to be missing from this model is a rivalrous intra-species fight to the death. The omission is surprising: if we posit a jealous AI, built and trained by humans, acquiring and pursuing goals that put it in opposition to the existence of 100% of humans, we should also fear humans armed with AI acquiring and pursuing goals that put them in opposition to the existence of the rest of humanity.
This could be treated as a semantic problem if we define any genocidal application of AI to be a form of “misalignment,” whether the AI is manipulating humans to achieve murderous ends or vice versa. My test for the usefulness of a distinction between these two cases is whether we can imagine a significant subset of AI development firms willingly pursuing the genocidal path, rather than treating all such paths as equally aberrant. This test is, after all, where the rubber meets the road. Obviously, all firms wish to avoid a paperclip maximizer (that is the whole point of the paperclip-maximizer example: a being using its superintelligence to pursue horrifically ludicrous ends). We can imagine human genocidal fanatics and cultists aiming to acquire AI technology, but a random sample of AI developers would deem such goals contrary to their own and, furthermore, self-defeating in the long run, in the sense that they would be contributing to a world no one should objectively prefer to live in: a world filled with lethal strife over religious beliefs.
But there are conditions under which genocidal goals would be rational. If anything, the contrary holds: willingly suffering a perpetual competition with 8-10 billion other people for the planet’s limited resources is what is generally irrational, given a good alternative. So if one could shrink the population down by, say, 99.9% (leaving roughly 8-10 million people), the world could be a permanently better place for the remaining population. Unlike ideologically driven genocide, such a pivotal event would establish no self-defeating precedent: all the newly downsized population would have to do to avoid further genocide is maintain itself at a mutually agreed-upon size, a far less onerous task than, say, maintaining homogeneity of belief. In short, I don’t think it would be impossible, or even that difficult, to convince a significant set of actors capable of AI development to view a “small is beautiful” utopia as a goal aligned with their own interests. This concern therefore does not belong in the same “X-risk” class as those involving the extinction of humanity (via paperclip maximization or otherwise).
It may be that the only way to avoid the universally undesirable outcome of AI killing all humans is to avoid killing any humans at all, or more than some small number of them. Although I can’t speak knowledgeably about AI safety methods, this seems like it would be a happy but unlikely coincidence. What seems to me more likely is the opposite: a small population would actually decrease the chances of a complete AI disaster, both because a small population can more easily govern and monitor its own application of AI, and because the power and sophistication of its automation could operate at a much smaller scale, leaving more room between AI usefulness and risk. I suspect that similar reasoning applies to the problem of getting to a small population in the first place. With enough collusion among AI developers and the element of surprise, the artificial intelligence required simply to kill 99.9% of the population in an isolated event perhaps need not be very close to human level (see this post for a treatment of the minimum intelligence requirements for killing lots of people).
So I’m concerned that the “winner-takes-all” contest to pay attention to is not the gentlemanly innovation race among AI development firms to bring the best AI to consumers (safety be damned!), but the contest between any combination of small groups armed with “just good enough” AI and the rest of the population for a place on this planet. The upshot would be that, from the perspective of the average person, whether or not AI ends up taking the operators down with everyone else is a purely academic concern.