The value-loading problem is the problem of getting an AI to value certain things, that is, of writing its utility function. One way to attack it is to hard-code something into the function, like “paperclips good!”. This is direct specification: writing, by hand, a function that values certain things. But when we want an AI to value something like “doing the right thing”, direct specification becomes infeasible.
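To make the contrast concrete, here's a toy sketch (Python, with made-up names; not anything from the source) of what direct specification looks like. The point is only that the designer hand-writes what counts as good:

```python
from dataclasses import dataclass

@dataclass
class WorldState:
    paperclips: int
    happy_humans: int

def utility(state: WorldState) -> float:
    """Direct specification: the designer hard-codes 'paperclips good!'.

    Easy for a toy goal like this one, but there is no obvious
    hand-written analogue for 'doing the right thing' -- which is
    exactly where this approach stops working.
    """
    return float(state.paperclips)
```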
Instead, you could solve the problem by having the AI figure out what you want by itself. The idea is that the AI can work out the aggregate of human morality and act accordingly after simply being told to “do what I mean” or something similar. While this might require more cognitive work from the AI, it is almost certainly safer than trying to formalize morality ourselves. In theory this approach avoids an AI that suddenly breaks down on some edge case, for example a smile maximizer filling the galaxy with tiny smileys instead of happy humans having fun.
This is all a loose paraphrase of the last liveblogging event EY did in the FB group, where he discussed open problems in FAI.
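As a purely illustrative sketch of my own (not from EY's discussion), here is one toy way to frame "figure out what I mean": the AI holds candidate hypotheses about what humans value and discards those inconsistent with observed human choices. The hypothesis space and observations below are invented, and real value-learning proposals are far more involved:

```python
import itertools

# Candidate hypotheses about what humans value: weights on
# (number of smiles, actual human happiness). A pure smile
# maximizer is just one hypothesis among many.
hypotheses = [(ws, wh) for ws, wh in itertools.product([0.0, 0.5, 1.0], repeat=2)]

def score(weights, outcome):
    ws, wh = weights
    smiles, happiness = outcome
    return ws * smiles + wh * happiness

# Observed human choices: (chosen_outcome, rejected_outcome) pairs.
# Humans prefer a few genuinely happy people over a galaxy of tiny smileys.
observations = [
    ((10, 10), (1_000_000, 0)),
]

# Keep only hypotheses consistent with every observed choice.
consistent = [
    h for h in hypotheses
    if all(score(h, chosen) >= score(h, rejected)
           for chosen, rejected in observations)
]
print(consistent)  # the smileys-only hypothesis (1.0, 0.0) has been eliminated
```

The toy version shows the appeal: the smiley-maximizing hypothesis gets ruled out by evidence about what humans actually choose, rather than having to be anticipated and excluded by hand.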