I understand CEV to mean extrapolating many different versions of each person, starting at different points in their life before the AGI gained the ability to rewrite their preferences, and doing whatever the majority of sims vote for (a toy sketch of this procedure follows below). I expect this to yield 2 for some kind of extrapolation. Defining it to exclude a weaker version of 3 that favors the AGI may prove tricky, but I believe such a definition exists, and I tend to believe we can find one.
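(To make the shape of that procedure concrete, here is a purely illustrative toy sketch in Python. Nothing here is part of any real CEV specification: `extrapolate`, the snapshot dictionaries, and the policy labels are all hypothetical stand-ins; the only point is "many pre-AGI snapshots per person, one ballot per extrapolated sim, strict majority wins.")

```python
from collections import Counter

def extrapolate(person_snapshot):
    """Hypothetical stand-in: return the policy this version of the
    person would endorse on reflection. Here it just reads a field."""
    return person_snapshot["endorsed_policy"]

def cev_vote(population_snapshots):
    """For each person, take snapshots from several points in their life
    (all predating AGI influence), extrapolate each one, and return the
    policy favored by a strict majority of the extrapolated sims."""
    ballots = [extrapolate(snap)
               for person in population_snapshots
               for snap in person]          # one ballot per sim
    tally = Counter(ballots)
    policy, votes = tally.most_common(1)[0]
    # Require a strict majority rather than a mere plurality.
    return policy if votes > len(ballots) / 2 else None

# Example: two people, three life-stage snapshots each.
population = [
    [{"endorsed_policy": "A"}, {"endorsed_policy": "A"}, {"endorsed_policy": "B"}],
    [{"endorsed_policy": "A"}, {"endorsed_policy": "A"}, {"endorsed_policy": "B"}],
]
print(cev_vote(population))  # -> "A" (4 of 6 sims vote for it)
```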
{ETA: the AI doesn’t technically want to kill all humans. If it can give our extrapolated selves values that are satisfied by a tautology, it should just sit there; the end result will depend on factors like whether the programmers realize what happened.}
(Also, I believe that if the definition is technically uncomputable or unwieldy or simply impossible to test for moral reasons, a well-designed AGI can still form reasonable beliefs about it.)
5 and 6 seem too vague to agree or disagree with. I’ve seen at least one person use the word “wireheading” to cover not just 4 (which I tentatively disagree with, especially if “all” means all) but probably also many versions of 2. If “wireheading” includes anything that fails to serve evolution’s pseudo-goals in giving us our desires, then 5 seems almost trivially true.
No. 5, “Wireheading”, is an established trope. So is No. 4, being addicted to dreams or simulations. There is a clear difference between the two, but they are similar in that both evoke disgust at the idea of a person abandoning or forgetting their “real-world” responsibilities out of selfishness.
EDIT: For future reference, hairyfigment’s comment was made while trope 2 was “CEV is characterized by a complex set of preferences”