FAI builders do not need to be saints. No sane strategy would be set up that way. They need to endorse principles of non-jerkness enough to endorse indirect normativity (e.g. CEV). And that’s it. Morality is not sneezed into AIs by contact with the builders.
Haven’t you considered extrapolating the volition of a single person if CEV for many people looks like it won’t work out, or will take significantly longer? Three out of three non-vegetarian LessWrongers (my best model for MIRI employees, present and future, aside from you) I have discussed this with say they care about something besides sentience, such as sapience. Because they have believed for a while that that’s what they care about, I think it has become their true value, and a CEV based on them alone would not act on concern for sentience without sapience. These are people who take MWI and cryonics seriously, probably because you and Robin Hanson do and have argued in favor of them. And you could probably change the opinion of these people, or at least of people on the road to becoming like them, with a few blog posts.
Because in HPMOR you used the word “sentience” (which in sci-fi is typically used to mean sapience) instead of something like “having consciousness,” I am worried you are sending people down that path by letting them think HPJEV draws the moral-importance line at sapience. That is in addition to my concern that you are showing others that a professional rationalist thinks animals aren’t sentient.
I did finally read the 2004 CEV paper recently, and it was fairly reassuring in a number of ways. (The “Jews vs Palestinians cancel each other but Martin Luther King and Gandhi add together” thing sounded… plausible but a little too cutely elegant for me to trust at first glance.)
I guess the question I have is (this is less relevant to the current discussion, but I’m pretty curious): in the event that CEV fails to produce a useful outcome (i.e., values diverge too much), is there a backup plan that doesn’t hinge on someone’s judgment? (Is there a backup plan, period?)
They need to endorse principles of non-jerkness enough to endorse indirect normativity
Indirect Normativity is more a matter of basic sanity than non-jerky altruism. I could be a total jerk and still realize that I wanted the AI to do moral philosophy for me. Of course, even if I did this, the world would turn out better than anyone could imagine, for everyone. So yeah, I think it really has more to do with being A) sane enough to choose Indirect Normativity, and B) mostly human.
Also, I would regard it as a straight-up mistake for a jerk to extrapolate anything but their own values. (Or for a non-jerk, for that matter.) If they are truly altruistic, the extrapolation should reflect this. If they are not, building altruism or egalitarianism in at a basic level is just dumb (for them; nice for me).
(Of course, then there are arguments for being honest and building in altruism at a basic level, as your supporters wanted you to. Which then suggests the strategy of building in altruism towards only your supporters, which seems highly prudent if there is any doubt about whom we should be extrapolating. And then there is the meta-uncertain argument that you shouldn’t do too much clever reasoning without adult supervision. And then of course there is the argument that these details have low value of information (VOI) compared to making the damn thing work at all. At which point I will shut up.)