Value extrapolation can be defined as an account of what human values, morals, and desires would be under “ideal circumstances”. These circumstances refer to having access to full information about our motivations, their origins, and their goals, and are proposed as the model on top of which machine ethics should be developed.
It is well known that the true origins of our moral evaluations and motivations are out of our conscious reach. The process by which they developed has left us with desires we wish didn’t exist or could suppress (which in turn reveals our capacity for “second-order desires”, such as wishing not to wish to eat so much cake). As such, it seems that a developed society should try to become aware of, and informed about, the roots and paths that led to our current values. Understanding the unconscious cognitive processes that give rise to these values could help us shift to a set of values intentionally chosen through a state of “reflective equilibrium”.
Yudkowsky, through Coherent Extrapolated Volition, has proposed that an extrapolation of our motivations and goals could have advantages when developing the first seed AI. The extrapolation of values, in a complementary way, seems useful when thinking about a system of machine ethics, namely:
the use of real human values after the reflective process;
faster AI moral progress;
dissolving preference contradictions;
simplification of the human values through elimination of artifacts;
a possible solution for integrating human goals into AI systems;
convergence of different human values.
Further Reading & References
Coherent Extrapolated Volition by Eliezer Yudkowsky
The Singularity and Machine Ethics by Muehlhauser & Helm
“Indirect Normativity” Write-up by paulfchristiano, LW post and comments
Coherent Aggregated Volition: A Method for Deriving Goal System Content for Advanced, Beneficial AGIs by Ben Goertzel