They need to endorse principles of non-jerkness enough to endorse indirect normativity
Indirect Normativity is more a matter of basic sanity than non-jerky altruism. I could be a total jerk and still realize that I want the AI to do moral philosophy for me. Of course, even if I did this, the world would turn out better than anyone could imagine, for everyone. So yeah, I think it really has more to do with being A) sane enough to choose Indirect Normativity, and B) mostly human.
Also, I would regard it as a straight-up mistake for a jerk to extrapolate anything but their own values. (Or a non-jerk, for that matter.) If they are truly altruistic, the extrapolation should reflect this. If they are not, building altruism or egalitarianism in at a basic level is just dumb (for them, nice for me).
(Of course, there are then arguments for being honest and building in altruism at a basic level, as your supporters wanted you to. That in turn suggests the strategy of building in altruism towards only your supporters, which seems highly prudent if there is any doubt about who we should be extrapolating. Then there is the meta-uncertain argument that you shouldn't do too much clever reasoning outside of adult supervision. And then of course there is the argument that these details have low VOI (value of information) compared to making the damn thing work at all. At which point I will shut up.)