Although what if we told each party to submit goals rather than non-goal preferences? If the AI has access to a model specifying which actions lead to which consequences, then it can search for the actions that maximize the number of goals fulfilled regardless of which party submitted them, or perhaps take a Rawlsian approach of maximizing the number of fulfilled goals submitted by whichever party would end up with the fewest goals fulfilled under that sequence of actions, etc. That seems very imaginable to me. You can then have heuristics that constrain the search space and so on. You can also accept non-goal preferences in addition to goals if the parties have any of those.
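To make that concrete, here's a toy sketch of the kind of search I have in mind; the consequence model, the goal sets, and both scoring rules are made up purely for illustration, not taken from any actual CEV proposal:

```python
# Minimal sketch of goal-based aggregation; all names and data are invented.

# Hypothetical consequence model: each candidate action sequence is mapped
# to the set of world-facts it would bring about.
CONSEQUENCES = {
    ("go_to_store", "buy_chocolate"): {"has_chocolate", "left_house"},
    ("stay_home", "watch_film"):      {"watched_film"},
    ("go_to_store", "rent_film"):     {"left_house", "watched_film"},
}

# Goals submitted by each party (also invented).
GOALS = {
    "alice": {"has_chocolate", "left_house"},
    "bob":   {"watched_film"},
}

def fulfilled_per_party(outcome):
    """Count how many of each party's goals hold in a given outcome."""
    return {party: len(goals & outcome) for party, goals in GOALS.items()}

def total_score(outcome):
    """Sum-style rule: total goals fulfilled, ignoring who submitted them."""
    return sum(fulfilled_per_party(outcome).values())

def maximin_score(outcome):
    """Rawlsian-style rule: goals fulfilled for the worst-off party."""
    return min(fulfilled_per_party(outcome).values())

def best_plan(score):
    """Brute-force search over the candidate plans under a given scoring rule."""
    return max(CONSEQUENCES, key=lambda plan: score(CONSEQUENCES[plan]))

print(best_plan(total_score))    # ('go_to_store', 'buy_chocolate')
print(best_plan(maximin_score))  # ('go_to_store', 'rent_film')
```

Note that even on this toy example the two aggregation rules pick different plans, which is really just the arbitrariness I complain about at the end.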
In that light, it seems to me that the problem was inferring goals from a set of preferences that were not purely non-goal preferences but were actually presented with some unspecified goals in mind. E.g., one party wanted chocolate but said “I want to go to the store” instead. If that was the source of the original problem, then we can see why we might need an AI to solve it, since it calls for some lightweight mind reading. Of course, a CEV-implementing AI would have to be a mind reader anyway, since we don’t really know what our goals ultimately are given everything we could know about reality.
This still does not guarantee basic morality, but parties should at least recognize some of their ultimate goals in the end result. They might still grumble about the result not being exactly what they wanted, but we can at least scold them for lacking a spirit of compromise.
All this presupposes that enough of our actions can be reduced to ultimate goals that can be discovered, and I don’t think this process guarantees we will be satisfied with the results. For example, this might erode personal freedom to an unpleasant degree. If we would choose to live in some world X if we were wiser and nicer than we are, then it doesn’t necessarily follow that X is a Nice Place to Live as we are now. Changing ourselves to reach that level of niceness and wisdom might require unacceptably extensive modifications to our actual selves.
My recent paper touches upon preference aggregation a bit in section 8, BTW, though it’s mostly focused on the question of figuring out a single individual’s values. (Not sure how relevant that is for your comments, but thought maybe a little.)
Thanks, I’ll look into it.
(And all my ranting still didn’t address the fundamental difficulty: there is no rational way to choose among different projections of the values held by multiple agents, such as Rawlsianism and utilitarianism.)