I feel like there’s a persistent assumption that not even a well-aligned AI will include human choices as a step in decisions like these. Maybe such a choice would amount to a mere checkbox in the AI’s overall puppeteering of circumstances, its predictions being so keen, but for human choice to go completely unmentioned in any of the hypotheticals seems like a glaring omission to me.
Yes, I am worried that it would be very easy for an AI to get humans to endorse its decisions, so it would be better if the AI were well aligned without having to ask for human choices. But the whole system is still ultimately grounded in human choice, at least in terms of idealised meta-preferences. I just feel that actually asking is a poor and easily manipulable way of eliciting those choices, unless it’s done very carefully.