Yeah, the terminology doesn’t seem to be consistently used. On one hand, Eliezer seems to use it as a general term for “safe” AI:
Creating Friendly AI, 2001: The term “Friendly AI” refers to the production of human-benefiting, non-human-harming actions in Artificial Intelligence systems that have advanced to the point of making real-world plans in pursuit of goals.
Artificial Intelligence as a Positive and Negative Factor in Global Risk, 2006/2008: It would be a very good thing if humanity knew how to choose into existence a powerful optimization process with a particular target. Or in more colloquial terms, it would be nice if we knew how to build a nice AI.
To describe the field of knowledge needed to address that challenge, I have proposed the term “Friendly AI”. In addition to referring to a body of technique, “Friendly AI” might also refer to the product of technique—an AI created with specified motivations. When I use the term Friendly in either sense, I capitalize it to avoid confusion with the intuitive sense of “friendly”.
Complex Value Systems are Required to Realize Valuable Futures, 2011: A common reaction to first encountering the problem statement of Friendly AI (“Ensure that the creation of a generally intelligent, self-improving, eventually superintelligent system realizes a positive outcome”)...
On the other hand, some authors seem to use “Friendly AI” as a more specific term referring to a particular kind of AI design proposed by Eliezer. For instance:
Ben Goertzel, Thoughts on AI Morality, 2002: Eliezer Yudkowsky has recently put forth a fairly detailed theory of what he calls “Friendly AI,” which is one particular approach to instilling AGI’s with morality (Yudkowsky, 2001a). The ideas presented here, in this (much briefer) essay, are rather different from Yudkowsky’s, but they are aiming at roughly the same goal.
Ben Goertzel, Apparent Limitations on the “AI Friendliness” and Related Concepts Imposed By the Complexity of the World, 2006: Eliezer Yudkowsky, in his various online writings (see links at www.singinst.org), has introduced the term “Friendly AI” to refer to powerful AI’s that are beneficent rather than malevolent or indifferent to humans. On the other hand, in my prior writings (see the book The Path to Posthumanity that I coauthored with Stephan Vladimir Bugaj; and my earlier online essay “Encouraging a Positive Transcension”), I have suggested an alternate approach in which much more abstract properties like “compassion”, “growth” and “choice” are used as objectives to guide the long-term evolution and behavior of AI systems. [...]
My general feeling, related here in the context of some specific arguments, is not that Friendly AI is a bad thing to pursue in any moral sense, but rather that it is very likely to be unachievable for basic conceptual reasons.
Mark Waser, Rational Universal Benevolence: Simpler, Safer, and Wiser than “Friendly AI”, 2011: Insanity is doing the same thing over and over and expecting a different result. “Friendly AI” (FAI) meets these criteria on four separate counts by expecting a good result after: 1) it not only puts all of humanity’s eggs into one basket but relies upon a totally new and untested basket, 2) it allows fear to dictate our lives, 3) it divides the universe into us vs. them, and finally 4) it rejects the value of diversity. In addition, FAI goal initialization relies on being able to correctly calculate a “Coherent Extrapolated Volition of Humanity” (CEV) via some as-yet-undiscovered algorithm. Rational Universal Benevolence (RUB) is based upon established game theory and evolutionary ethics and is simple, safe, stable, self-correcting, and sensitive to current human thinking, intuitions, and feelings. Which strategy would you prefer to rest the fate of humanity upon?