I’m actually fairly concerned about the possibility of you influencing the beliefs of AI researchers, in particular.
I’m not sure if it ends up mattering for FAI, if executed as currently outlined. My understanding is that the point is that it’ll be able to predict the collective moral values of humanity-over-time (or safely fail to do so), and your particular guesses about ethical-hidden-variables shouldn’t matter.
But I can imagine plausible scenarios where various ethical-blind-spots on the part of the FAI team, or people influenced by it, end up mattering a great deal in a pretty terrifying way. (Maybe people in that cluster decide they have a better plan, and leave and do their own thing, where ethical-blind-spots/hidden-variables matter more).
This concern extends beyond vegetarianism and doesn’t have a particular recommended course of action beyond “please be careful about your moral reasoning and public discussion thereof”, which presumably you’re doing already, or trying to.
FAI builders do not need to be saints. No sane strategy would be set up that way. They need to endorse principles of non-jerkness enough to endorse indirect normativity (e.g. CEV). And that’s it. Morality is not sneezed into AIs by contact with the builders.
Haven’t you considered extrapolating the volition of a single person if CEV for many people looks like it won’t work out, or will take significantly longer? Three out of three non-vegetarian LessWrongers I have discussed this with (my best model for MIRI employees, present and future, aside from you) say they care about something besides sentience, like sapience. Because they have believed that’s what they care about for a while, I think it has become their true value, and a CEV based on them alone would not act on concern for sentience without sapience. These are people who take MWI and cryonics seriously, probably because you and Robin Hanson do and have argued in favor of them. And you could probably change the opinion of these people, or at least of people on the road to becoming like them, with a few blog posts.
Because in HPMOR you used the word “sentience” (which is typically used in sci-fi to mean sapience) instead of something like “having consciousness,” I am worried you are sending people down that path by letting them think HPJEV draws the moral-importance line at sapience, on top of my concern that you are showing others that a professional rationalist thinks animals aren’t sentient.
I did finally read the 2004 CEV paper recently, and it was fairly reassuring in a number of ways. (The “Jews vs Palestinians cancel each other but Martin Luther King and Gandhi add together” thing sounded… plausible but a little too cutely elegant for me to trust at first glance.)
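For what it’s worth, here is the toy picture I had in mind while reading that part (entirely my own sketch, not anything the paper actually specifies): treat each extrapolated volition as a set of signed weights over issues, so directly opposed volitions cancel out and shared volitions add together.

```python
# Toy illustration only -- my own sketch of the "cancel vs add" intuition,
# not the CEV paper's actual procedure. Issue names are made up for the example.

from collections import defaultdict

def aggregate_volitions(volitions):
    """Sum signed preference weights per issue across extrapolated volitions."""
    totals = defaultdict(float)
    for volition in volitions:
        for issue, weight in volition.items():
            totals[issue] += weight
    # Keep only issues where the summed signal is clearly nonzero ("coherent").
    return {issue: w for issue, w in totals.items() if abs(w) > 1e-9}

# Two volitions that directly oppose each other on "land_claim" cancel out,
# while their shared support for "nonviolence" adds together.
volitions = [
    {"land_claim": +1.0, "nonviolence": +1.0},
    {"land_claim": -1.0, "nonviolence": +1.0},
]

print(aggregate_volitions(volitions))  # {'nonviolence': 2.0}
```

Obviously the real proposal is nothing like this simple, but that’s roughly how I’ve been picturing the cancel/add claim.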
I guess the question I have is (this is less relevant to the current discussion, but I’m pretty curious): in the event that CEV fails to produce a useful outcome (i.e. values diverge too much), is there a backup plan that doesn’t hinge on someone’s judgment? (Is there a backup plan, period?)
They need to endorse principles of non-jerkness enough to endorse indirect normativity
Indirect Normativity is more a matter of basic sanity than non-jerky altruism. I could be a total jerk and still realize that I wanted the AI to do moral philosophy for me. Of course, even if I did this, the world would turn out better than anyone could imagine, for everyone. So yeah, I think it really has more to do with being A) sane enough to choose Indirect Normativity, and B) mostly human.
Also, I would regard it as a straight-up mistake for a jerk to extrapolate anything but their own values. (Or for a non-jerk, for that matter.) If they are truly altruistic, the extrapolation should reflect this. If they are not, building altruism or egalitarianism in at a basic level is just dumb (for them, nice for me).
(Of course then there are arguments for being honest and building in altruism at a basic level like your supporters wanted you to. Which then suggests the strategy of building in altruism towards only your supporters, which seems highly prudent if there is any doubt about who we should be extrapolating. And then there is the meta-uncertain argument that you shouldn’t do too much clever reasoning outside of adult supervision. And then of course there is the argument that these details have low VOI compared to making the damn thing work at all. At which point I will shut up.)