I’m not sure what I was expecting, but I was a little surprised to see you say that you object to objective morality. I probably don’t understand CEV well enough, and I’m pretty sure this isn’t the case, but it seems like there is a great deal of similarity between CEV and some form of objective morality as described above. In other words, if you don’t think moral beliefs will eventually converge given enough intelligence, reflection, data-gathering, and so on, then how do you convince someone that FAI will make the “correct” decisions based on the extrapolated volition?
CEV in its current form is quite under-specified. I expect that there are many, many different ways of specifying it, each of which would produce a different CEV converging on a different solution.
For example, Tarleton (2010) notes that CEV is really a family of algorithms which share the following features:
Meta-algorithm: Most of the AGI’s goals will be obtained at run-time from human minds, rather than explicitly programmed in before run-time.
Factually correct beliefs: The AGI will attempt to obtain correct answers to various factual questions, in order to modify preferences or desires that are based upon false factual beliefs.
Singleton: Only one superintelligent AGI is to be constructed, and it is to take control of the world with whatever goal function is decided upon.
Reflection: Individual or group preferences are reflected upon and revised.
Preference aggregation: The set of preferences of a whole group are to be combined somehow.
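To make that shared structure a little more concrete, here is a minimal sketch of one member of the family in Python. Everything in it — the stage names, the types, the fixed ordering — is my own toy framing rather than anything taken from Tarleton’s paper; it only illustrates how the goal content gets pulled out of models of human minds at run-time and then corrected, reflected on, and aggregated.

```python
from typing import Callable, Dict, List

PersonModel = Dict[str, float]   # hypothetical stand-in for a model of one human mind
Preferences = List[float]        # hypothetical stand-in for a set of preference strengths

def cev_family_member(
    extract: Callable[[PersonModel], Preferences],          # read values out of a mind model at run-time
    correct: Callable[[Preferences], Preferences],          # revise preferences resting on false beliefs
    reflect: Callable[[Preferences], Preferences],          # idealized reflection / revision
    aggregate: Callable[[List[Preferences]], Preferences],  # combine the whole group's preferences somehow
) -> Callable[[List[PersonModel]], Preferences]:
    """One member of the family: choosing these four functions picks out a variant."""
    def run(population: List[PersonModel]) -> Preferences:
        extrapolated = [reflect(correct(extract(person))) for person in population]
        return aggregate(extrapolated)
    return run
```

Note that this particular sketch bakes in one ordering — correction, then reflection, with aggregation at the end — which is exactly the kind of design choice the quote below is about.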
He comments:
The set of factually correcting, singleton, reflective, aggregative meta-algorithms is larger than just the CEV algorithm. For example, there is no reason to suppose that factual correction, reflection, and aggregation, performed in any order, will give the same result; therefore, there are at least 6 variants depending upon ordering of these various processes, and many variants if we allow small increments of these processes to be interleaved. CEV also stipulates that the algorithm should extrapolate ordinary human-human social interactions concurrently with the processes of reflection, factual correction and preference aggregation; this requirement could be dropped.
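Tarleton’s point about orderings is easy to see with a toy example. The sketch below is entirely made up for illustration — the three stages are caricatures, not anything proposed in the paper — but it shows why applying factual correction, reflection, and aggregation in the 3! = 6 possible orders can land on six different answers.

```python
from itertools import permutations

# Three people's support (0..1) for some policy; purely illustrative numbers.
profile = [0.2, 0.9, 0.4]

def factual_correction(p):
    # Toy rule: correcting a shared false belief dampens strong enthusiasm.
    return [x * 0.8 if x > 0.5 else x for x in p]

def reflection(p):
    # Toy rule: on reflection, weak preferences weaken and strong ones strengthen.
    return [min(1.0, x + 0.1) if x >= 0.5 else max(0.0, x - 0.1) for x in p]

def aggregation(p):
    # Toy rule: each person is pulled halfway toward the group mean.
    mean = sum(p) / len(p)
    return [(x + mean) / 2 for x in p]

stages = {"correct": factual_correction, "reflect": reflection, "aggregate": aggregation}

# The 3! = 6 orderings mentioned in the quote generally give different end states.
for order in permutations(stages):
    result = profile
    for name in order:
        result = stages[name](result)
    print(order, [round(x, 3) for x in result])
```

With these toy numbers the six orderings print six distinct preference profiles, which is the ordering-dependence the quote describes; interleaving small increments of the stages, as the quote also mentions, would multiply the variants further.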
Although one of Eliezer’s desired characteristics for CEV was to “avoid creating a motive for modern-day humans to fight over the initial dynamic”, a more rigorous definition of CEV will probably require making many design choices for which there will not be any objective answer, and which may be influenced by the designer’s values. The notion that our values should be extrapolated according to some specific criteria is by itself a value-laden proposal: it might be argued that it was enough to start off from our current-day values just as they are, and then incorporate additional extrapolation only if our current values said that we should do so. But doing so would not be a value-neutral decision either, but rather one supporting the values of those who think that there should be no extrapolation, rather than of those who think there should be.
I don’t find any of these issues to be problems, though: as long as CEV found any of the solutions in the set-of-final-values-that-I-wouldn’t-consider-horrible, the fact that the solution isn’t unique isn’t much of an issue. Of course, it’s quite possible that CEV will hit on some solution in that set that I would judge to be inferior to many others also in that set, but so it goes.