CEV (Coherent Extrapolated Volition) is our current proposal for what ought to be done once you have an AGI flourishing around. Many people have had misgivings about it. While at the Singularity Institute, I decided to write a text discussing CEV: what it is for, how likely it is to achieve its goals, and how much fine-grained detail needs to be added before it becomes an actual theory.
Below is a draft of the topics I'll be discussing in that text. The purpose of showing it here is so that you can look over the topics, spot something that is missing, and write a comment saying: "Hey, you forgot this problem, which, summarised, is such-and-such" or "be sure to mention paper X when discussing topic 2.a.i."
Please take a few minutes to help me add better discussions.
Do not worry about pointing me to previous Less Wrong posts about it; I have them all.
Summary of CEV
Troubles with CEV
Troubles with the overall suggestion
Concepts on which CEV relies that may not be well enough defined
Troubles with coherence
The volitions of the same person in two different emotional states might differ; it's as if they were two different people. Is there any good criterion by which a person's "ultimate" volition may be determined? If not, is it certain that even the volitions of one person's multiple selves will converge?
But when you start dissecting most human goals and preferences, you find they contain deeper layers of belief and expectation. If you keep stripping those away, you eventually reach raw biological drives, which are not human beliefs or expectations. (Though even those are, in a sense, the beliefs and expectations of evolution; let's ignore that for the moment.)
Once you strip away human beliefs and expectations, nothing remains but biological drives, which even animals have. Yes, an animal, by virtue of its biological drives and ability to act, is more than a predicting rock, but that does not address the issue at hand.
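To make the coherence worry concrete, here is a toy sketch of my own (not part of the CEV document, with hypothetical options and "selves"): treat the selves a person occupies in different emotional states as three preference orderings and aggregate them by pairwise majority. Each ordering is perfectly consistent on its own, yet the aggregate can cycle, so no single "ultimate" ordering falls out.

```python
# Toy sketch (my own illustration, hypothetical options): three "selves" of the
# same person, each with a perfectly consistent ranking of three options,
# aggregated by pairwise majority vote. The aggregate cycles, so there is no
# coherent "ultimate" ordering to extract.
from itertools import combinations

selves = {
    "calm":  ["work", "rest", "party"],
    "angry": ["party", "work", "rest"],
    "tired": ["rest", "party", "work"],
}

def prefers(ranking, a, b):
    """True if option a is ranked above option b in this ordering."""
    return ranking.index(a) < ranking.index(b)

options = ["work", "rest", "party"]
for a, b in combinations(options, 2):
    votes_for_a = sum(prefers(ranking, a, b) for ranking in selves.values())
    winner = a if votes_for_a > len(selves) / 2 else b
    print(f"{a} vs {b}: majority prefers {winner}")

# Prints: work beats rest, party beats work, rest beats party -- a cycle.
```

Nothing hinges on this particular example; it just shows that the "coherent" in CEV is a substantive assumption rather than a bookkeeping step, even within a single person.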
Troubles with extrapolation
Are small accretions of intelligence analogous to small accretions of time in terms of identity? Is extrapolated person X still a reasonable political representative of person X?
Problems with the concept of Volition
Blue-eliminating robots (Yvain's post)
Error minimizer
Goals vs. volitions
Problems of implementation
Undesirable solutions to hardware shortage or time shortage (the machine decides to compute only the coherent volition, CV, but skips the extrapolation, E)
Sample bias
Solving apparent non-coherence by meaning shift
Praise of CEV
Bringing the issue to practical level
Ethical strength of egalitarianism
Alternatives to CEV
( )
( )
Normative approach
Extrapolation of written desires
Solvability of remaining problems
Historical perspectives on problems
Likelihood of solving problems before 2050
How humans have dealt with unsolvable problems in the past