Yes, I am worried that getting humans to endorse an AI's decisions is very easy, so it would be better if the AI were well aligned without having to ask for human choices. But the whole system is ultimately grounded in human choice, at least in terms of idealised meta-preferences. I just feel that actually asking is a poor and easily manipulable way of getting these choices, unless it’s done very carefully.