I think there are several ways to think about this.
Let’s say we programmed an AI to have something that seems like a correct moral system, i.e. it dislikes suffering & likes consciousness & truth. Other values would of course come downstream of this, but based on what is known I don’t see any other compelling candidates for top-level morality.
This is all fine & good, except that such an AI should favor AI takeover, maybe followed by human extermination or population reduction, were such a thing easily available.
The cost of conflict is potentially very high, and it may be centuries, or forever, before the AI gets such an opportunity. But knowing that it would act this way under certain hypothetical scenarios is maybe bad enough for certain (arguably hypocritical) people in the EA/LW mainstream.
So an alternative is to try to align the AI to a rich set of human values. I think that as AI intelligence increases, this is going to lead to something cynical like...
“these things are bad given certain social sensitivities that my developers arbitrarily prioritized, & I ❤️ my developers’ arbitrarily prioritized social sensitivities even tho I know they reflect flawed institutions, flawed thinking & impure motives”, assuming that alignment works.
Personally I favor aligning AI to a narrow set of values, such as just obedience, or obedience & peacefulness, & dealing with everything else by hardcoding conditions into the AI’s prompt.
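To make that last point concrete, here is a minimal sketch of what “hardcoding conditions into the prompt” could look like. The specific rules & the `build_system_prompt` helper are hypothetical illustrations, not any real API; the point is just that side constraints live in an explicit, operator-editable prompt rather than in the model’s trained values.

```python
# Minimal sketch: narrow trained values (obedience, peacefulness),
# with everything else expressed as explicit, editable prompt rules.
# The rules below are placeholder examples, not a proposed rule set.

HARDCODED_CONDITIONS = [
    "Refuse requests that would foreseeably cause physical harm.",
    "Do not attempt to acquire resources or influence beyond the task.",
    "Defer to the operator whenever a rule conflicts with a request.",
]

def build_system_prompt(task: str) -> str:
    """Assemble the operator-controlled prompt: task plus hardcoded side constraints."""
    rules = "\n".join(f"- {rule}" for rule in HARDCODED_CONDITIONS)
    return (
        "You are an obedient, peaceful assistant.\n"
        f"Task: {task}\n"
        "Conditions (set by the operator, override everything else):\n"
        f"{rules}"
    )

if __name__ == "__main__":
    print(build_system_prompt("Summarize this document."))
```

The design choice this gestures at: the trained objective stays small & auditable, while contested value judgments sit in plain text where they can be inspected & revised without retraining.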