I don’t see much FAI-use for mapping human values because I expect to need to solve the value-loading problem via indirect normativity rather than direct specification (see Bostrom 2014).
Also, there is quite a lot of effort going into this already: World Values Survey, moral foundations theory, etc.
What does this mean?
The value-loading problem is the problem of getting an AI to value certain things, that is, of writing its utility function. One approach is to hard-code something into the function, like “paperclips good!”. This is direct specification: writing, by hand, a function that values particular things. It works for simple goals, but it becomes infeasible when we want an AI to value something like “doing the right thing”.
Instead, you could solve the problem by having the AI figure out what you want by itself. The idea is that the AI works out the aggregate of human morality and acts accordingly after being told something like “do what I mean”. While this requires more cognitive work from the AI, it is almost certainly safer than trying to formalize morality ourselves. In theory, this approach avoids an AI that suddenly breaks down on some edge case, for example a smile-maximizer filling the galaxy with tiny smileys instead of happy humans having fun. There is a toy sketch of the contrast between the two approaches below.
This is all a loose paraphrase of the last liveblogging event EY held in the FB group, where he discussed open problems in FAI.
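To make the contrast concrete, here is a toy Python sketch. This is my own illustration, not anything from Bostrom or the liveblogging discussion; the function names and the “smiles”/“happy_humans” state keys are made up. It just puts a directly specified utility function next to a stub of an agent that has to infer its utility from human feedback.

```python
from typing import Callable, List, Tuple

# Direct specification: the designer hard-codes what counts as good.
# Brittle in exactly the smile-maximizer way -- the proxy ("smiles")
# is not the thing we actually care about.
def direct_utility(world_state: dict) -> float:
    return world_state.get("number_of_smiles", 0)

# Indirect normativity (very loosely): the agent is pointed at a process
# ("figure out what humans would want") rather than at a fixed formula.
# Here that process is stubbed out as learning from labelled examples.
def learn_utility(examples: List[Tuple[dict, float]]) -> Callable[[dict], float]:
    """examples: (world_state, human_approval_score) pairs."""
    def learned_utility(world_state: dict) -> float:
        # Placeholder: score a state by comparing it to approved examples.
        # Everything hard about indirect normativity hides in here.
        scores = [
            approval
            for state, approval in examples
            if state.get("happy_humans") == world_state.get("happy_humans")
        ]
        return sum(scores) / len(scores) if scores else 0.0
    return learned_utility
```

The only point of the stub is that in the second case the designer specifies a process for getting at values rather than the values themselves, which is why it degrades more gracefully on cases nobody thought to hard-code.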