A Friendly AI (FAI) is an artificial intelligence that benefits humanity. It is contrasted with Unfriendly AI, which includes both Malicious AI and Uncaring AI.
Goals should be defined by aggregating the desires of humanity in a fair way, with special attention paid to the fact that people's current dispositions reflect untrue beliefs about the world. How to do this is an unsolved problem, for which Coherent Extrapolated Volition is an incomplete outline of a theory.
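To make "aggregating desires in a fair way" concrete, here is a minimal toy sketch using the Borda count, a standard voting rule from social choice theory. This is purely illustrative and is not CEV: the hard part the article describes, extrapolating what people would want if their beliefs were true, is not modeled at all, and the outcome names are hypothetical.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate individual preference rankings with the Borda count.

    Each ranking is a list of outcomes, best first. Returns outcomes
    ordered by total score (most preferred first)."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for place, outcome in enumerate(ranking):
            scores[outcome] += n - 1 - place  # top choice earns n-1 points
    return sorted(scores, key=lambda o: -scores[o])

# Hypothetical preference profiles for three people:
votes = [
    ["flourishing", "status quo", "extinction"],
    ["flourishing", "status quo", "extinction"],
    ["status quo", "flourishing", "extinction"],
]
print(borda_aggregate(votes))  # ['flourishing', 'status quo', 'extinction']
```

Even this trivial rule shows why fairness is subtle: results depend on which outcomes appear on the ballot, and no voting rule corrects for preferences grounded in false beliefs, which is exactly the gap CEV tries to address.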
Superpower
Superpower: a superintelligent machine would have unprecedented power to reshape reality, and would therefore achieve its goals through highly efficient methods that confound human expectations and desires. Such a machine's actions would be difficult to predict, because reliably predicting how something will solve a problem requires being at least approximately as smart as the problem solver.
Literalness
...
In general, human language and thought are designed to explain things to, and interface with, other humans who have similar ways of thinking. Designing an AI that lacks human failings such as hate, yet understands what humans care about, is an unprecedented challenge. This step, describing the goal system to the AI, builds on the solution to the prior problem of determining what goal system is fair to give the AI. It conceptually precedes ensuring that the goal system remains stable under recursive self-improvement.