I’m convinced of utilitarianism as the proper moral construct, but I don’t think an AI should use free-ranging utilitarianism, because it’s just too dangerous. A relatively small calculation error, or a somewhat eccentric view of the future, can lead to very bad outcomes indeed.
A really smart, powerful AI, it seems to me, should be constrained by rules of behavior (no wiping out humanity/no turning every channel into 24-7 porn/no putting everyone to work in the paperclip factory). The assumption that something very smart would necessarily reach correct utilitarian views seems facially false; it could assume that humans must think like it does, or assume that dogs generate more utility with less effort because they can be made happy more easily, or decide that humans need more superintelligent machines in a great big hurry and should build them regardless of anything else.
And maybe it’d be right here or there. But maybe not. I think almost definitionally that FAI cannot be full-on, free-range utilitarian of any stripe. Am I wrong?
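To make the contrast concrete, here is a toy sketch of what I mean by “free-range” versus “rule-constrained.” Everything in it is made up for illustration (the action names, the rules, the utility numbers); it is not a proposal for how a real FAI would be built, just the shape of the distinction: a pure maximizer acts on its own utility estimates, while a constrained one vetoes rule-violating actions before any utility comparison happens.

```python
# Illustrative sketch only: contrasting a "free-range" utility maximizer
# with one constrained by hard behavioral rules. All names here
# (Action, hard_rules, the utility numbers) are hypothetical.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    name: str
    expected_utility: float          # the AI's own (possibly mistaken) estimate
    wipes_out_humanity: bool = False
    builds_paperclip_factory: bool = False


# Hard rules: predicates an action must NOT satisfy. A violation vetoes
# the action outright, before any utility comparison.
hard_rules: List[Callable[[Action], bool]] = [
    lambda a: a.wipes_out_humanity,
    lambda a: a.builds_paperclip_factory,
]


def free_range_choice(actions: List[Action]) -> Action:
    # Pure utilitarian agent: a single bad utility estimate is enough
    # to select a catastrophic action.
    return max(actions, key=lambda a: a.expected_utility)


def rule_constrained_choice(actions: List[Action]) -> Action:
    # Same maximization, but only over actions that pass every hard rule.
    permitted = [a for a in actions if not any(rule(a) for rule in hard_rules)]
    return max(permitted, key=lambda a: a.expected_utility)


if __name__ == "__main__":
    options = [
        Action("cure diseases", expected_utility=90.0),
        Action("tile the universe with paperclips", expected_utility=1e9,
               builds_paperclip_factory=True),
    ]
    print(free_range_choice(options).name)        # picks the paperclip option
    print(rule_constrained_choice(options).name)  # picks "cure diseases"
```

The point of the sketch is only that the constrained agent’s safety does not depend on its utility estimates being correct, which is exactly what worries me about the free-range version.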
The ideas under consideration aren’t as simple as having the AI act by pleasure utilitarianism or preference utilitarianism, because we actually care about a whole lot of things in our evaluation of futures. Many of the things that might horrify us are things we’ve rarely or never needed to be consciously aware of, because nobody currently has the power or the desire to enact them; but if we miss adding just one hidden rule, we could wind up in a horrible future.
Thus “rule-following AI” has to get human nature just as right as “utilitarian AI” does in order to reach a good outcome. For that reason, Eliezer et al. are looking for more meta-level ways of choosing a utility function. Why they prefer utilitarianism to rule-based AI is another still-disputed question on this site (I should point out that I agree with Eliezer here).
Why are you more concerned about something with an unlimited ability to self-reflect making a calculation error than about the above itself being a calculation error? The AI could implement the above on its own if the calculation implicit in it is correct.