The Evolution of Humans Was Net-Negative for Human Values
(Epistemic status: publication date is significant.)
Some observers have argued that the totality of “AI safety” and “alignment” efforts to date have plausibly had a negative rather than positive impact on the ultimate prospects for safe and aligned artificial general intelligence. This perverse outcome is possible because research “intended” to help with AI alignment can have a larger impact on AI capabilities, moving existentially-risky systems closer to us in time without making corresponding cumulative progress on the alignment problem.
When things are going poorly, one is often inclined to ask “when it all went wrong.” In this context, some identify the founding of OpenAI in 2015 as a turning point: it was causally downstream of safety concerns, even though no one who had been thinking seriously about existential risk thought the original vision of OpenAI was a good idea.
But if we’re thinking about counterfactual impacts on outcomes, rather than grading the performance of the contemporary existential-risk-reduction movement in particular, it makes sense to posit earlier turning points.
Perhaps—much earlier. Foresighted thinkers such as Marvin Minsky (1960), Alan Turing (1951), and George Eliot (1879!!) had pointed to AI takeover as something that would likely happen eventually—is the failure theirs for not starting preparations earlier? Should we go back even earlier, and blame the ancient Greeks for failing to discover evolution and therefore adopt a eugenics program that would have given their descendants higher biological intelligence with which to solve the machine intelligence alignment problem?
Or—even earlier? There’s an idea that humans are the stupidest possible creatures that could have built a technological civilization: if it could have happened at a lower level of intelligence, it would have (and higher intelligence would not have had time to evolve).
But intelligence isn’t the only input into our species’s penchant for technology; our hands with opposable thumbs are well-suited for making and using tools, even though the proto-hands of our ancestors were adapted for climbing trees, not for tool use. An equally intelligent species with a less “lucky” body plan or habitat, similar to crows (lacking hands) or octopuses (living underwater, where, e.g., fires cannot start), might not have gotten started down the path of cultural accumulation of technology—even while a more intelligent crow- or octopus-analogue might have done so.
It’s plausible that the values of humans and biological aliens overlap to a much higher degree than those of humans and AIs; we should be “happy for” other biological species that solve their alignment problem, even if their technologically-mature utopia is different from the one we would create.
But that being the case, it follows that we should regard some alien civilizations as more valuable than our own, whenever the difference in values is outweighed by a sufficiently large increase in the probability of solving the alignment problem. (Most of the value of ancestral civilizations lies in the machine superintelligences that they set off, because ancestral civilizations are small and the Future is big.) If opposable thumbs were differentially favorable to AI capabilities over AI alignment, we should perhaps regard the evolution of humans as a tragedy: we should prefer to go extinct and be replaced by some other species that needed a higher level of intelligence in order to wield technology, and that would therefore have had more intelligence to bring to bear on the alignment problem by the time it could build AI at all. The evolution of humans was net-negative for human values.