I think this is a good point, and one reason to favor more CEV-style solutions to alignment, if they are possible, rather than solutions which make the values of the AI relatively “closer” to our original values.
Eh, CEV got rightly ditched as an actual solution to the alignment problem. The basic problem is that it assumed there was an objective moral reality, and we have little evidence of that. It’s very possible morals are subjective, which outright makes CEV non-viable. May that alignment solution never be revived.
CEV is a sketch of an operationalization of carefully deciding which goals end up being pursued, an alignment target. Its content doesn’t depend on the philosophical status of such goals or on how CEV gets instantiated, such as whether it gets used directly in the 21st century by the first AGIs or comes about later, when we need to get serious about making use of the cosmic endowment.
My preferred implementation of CEV (in the spirit of exploratory engineering) looks like a large collection of mostly isolated simulated human civilizations, where the AGIs individually assigned to them predict CEV in many different value-laden ways (the current understanding of values influences which details are predicted with morally relevant accuracy) and use it to guide their civilizations, depending on what the rules of setting up a particular civilization allow. As a whole, this gives a picture of path-dependence and tests prediction of CEV within CEV, so that it becomes possible to make more informed decisions about aggregating the results of different initial conditions (seeking coherence) and about the choice of initial conditions.
The primary issue with this implementation is potential mindcrime, though it might be possible to selectively modulate the precision used to simulate specific parts of these civilizations, to reduce the moral weight of simulated undesirable events, or for the civilization-guiding AGIs to intervene where necessary.
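For concreteness, here is a toy numerical sketch of the structure above (everything in it, from the value dimensions to the noise model, is a made-up illustration, not a proposal for how real CEV prediction would work): many mostly isolated runs start from perturbed initial conditions, each “AGI” produces a path-dependent prediction of extrapolated values, and aggregation proceeds only if the results across initial conditions roughly cohere.

```python
import random

# Toy sketch only: every name and number here is a made-up illustration of
# the structure described above, not a claim about real CEV prediction.

rng = random.Random(0)

NUM_CIVILIZATIONS = 100
VALUE_DIMS = 8             # stand-in for dimensions of extrapolated values
SETUP_NOISE = 0.05         # spread of initial conditions across civilizations
PATH_NOISE = 0.05          # path-dependence of each AGI's prediction
COHERENCE_THRESHOLD = 0.1  # max per-dimension variance tolerated when aggregating

# Humanity's current values: the shared starting point for every civilization.
current_values = [rng.uniform(-1.0, 1.0) for _ in range(VALUE_DIMS)]

def predict_cev(initial_conditions):
    """Stand-in for one civilization's AGI predicting extrapolated values.

    Real prediction would be value-laden; here path-dependence is modeled
    as Gaussian noise on the initial conditions.
    """
    return [v + rng.gauss(0.0, PATH_NOISE) for v in initial_conditions]

# Run many mostly isolated civilizations with perturbed initial conditions.
results = []
for _ in range(NUM_CIVILIZATIONS):
    initial = [v + rng.gauss(0.0, SETUP_NOISE) for v in current_values]
    results.append(predict_cev(initial))

def dimension_variances(rows):
    """Per-dimension spread of the predictions across civilizations."""
    variances = []
    for d in range(VALUE_DIMS):
        column = [row[d] for row in rows]
        mean = sum(column) / len(column)
        variances.append(sum((x - mean) ** 2 for x in column) / len(column))
    return variances

if all(v < COHERENCE_THRESHOLD for v in dimension_variances(results)):
    # Aggregate only when the different initial conditions roughly agree.
    aggregated = [sum(row[d] for row in results) / len(results)
                  for d in range(VALUE_DIMS)]
    print("coherent; aggregated extrapolated values:", aggregated)
else:
    # The analogue of gracefully declining to act on incoherent results.
    print("results do not cohere; defer the decision")
```

The interesting knob is the coherence threshold: it encodes how much disagreement across initial conditions one is willing to aggregate away rather than treat as a reason to defer.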
The basic problem is that it assumed there was an objective moral reality, and we have little evidence of that. It’s very possible morals are subjective, which outright makes CEV non-viable.
Do you mean by “objective moral reality” and morals “being subjective” something that interacts (at all) with the above description of CEV? Are you thinking of a very different meaning of CEV?
I think I might be thinking of a very different kind of CEV.
The basic problem is that it assumed there was an objective moral reality, and we have little evidence of that.
AFAICT this is false. CEV runs a check to see if human values turn out to cohere with each other (this says nothing about whether there is an objective morality), and if it finds that they do not, it gracefully shuts down.
My sense from reading the Arbital post on it is that Eliezer still considers it the ideal sort of thing to do with an advanced AGI after we gain a really high degree of confidence in its ability to do very complex things (which admittedly means it’s not very helpful for solving our immediate problems). I think some people disagree about it, but your statement as worded seems mostly false to me.
(I recommend folks read the full article: https://arbital.com/p/cev/ )
Only if one interprets “subjective” as meaning “arbitrary”. The second meaning of “subjective”, which it shares with “phenomena”, is quite in accordance with moral realism and CEV-like aspirations.