So—first, I don’t really believe in terminal values. When I use that term, I’m working within a hypothetical frame (if X is true, then Y); or using it as an approximation.
I said that the only real values an organism has are the values it implements. If you want to have a science of values, and for instance be able to predict what organisms will have what values, you want to work with the behaviors produced, which should follow certain rules; whereas the values that an organism believes it has are different from the values it implements due to accidents of evolution, and working with them will make things less predictable and a science of values more difficult.
Eliezer lists only propositional content as factors to consider. So I think he’s not talking about the difficulty of dividing an organism into values and value-implementing infrastructure. He seems to be saying that currently-implemented values are a poor copy of a Platonic ideal which we can extrapolate.
I would be less likely than Eliezer to consider my present values very similar to the “right” values. I think he would either say there are no right values, or that the right values are those extrapolated from your current values in a way that fixes accidents of evolution and flawed cognition and ignorance. But he doesn’t have an independent set of values to set up in opposition to your current values. I would, by contrast, feel comfortable saying “existence is better than non-existence”, “consciousness has value”, and “complexity is good”; those statements override whatever terminal values I have.
I don’t really think those statements are in the same category as my terminal values. My terminal values largely concern my own well-being, not what the universe should be like. My preference for good things to happen to me can’t be true or false.
I’m not comfortable with using “preference” and “value” interchangeably, either. “Preference” connotes likes: chocolate, classical music, fast cars, social status. My preferences are just lists of adjectives. “Value” connotes something with more structure: justice, liberty. Language treats these things as adjectives, but they aren’t primitive adjectives. They are predicates of more than one argument; preferences seem to be predicates of just one argument.
So—first, I don’t really believe in terminal values. When I use that term, I’m working within a hypothetical frame (if X is true, then Y); or using it as an approximation.
I said that the only real values an organism has are the values it implements. If you want to have a science of values, and for instance be able to predict what organisms will have what values, you want to work with the behaviors produced, which should follow certain rules; whereas the values that an organism believes it has are different from the values it implements due to accidents of evolution, and working with them will make things less predictable and a science of values more difficult.
Eliezer lists only propositional content as factors to consider. So I think he’s not talking about the difficulty of dividing an organism into values and value-implementing infrastructure. He seems to be saying that currently-implemented values are a poor copy of a Platonic ideal which we can extrapolate.
I would be less likely than Eliezer to consider my present values very similar to the “right” values. I think he would either say there are no right values, or that the right values are those extrapolated from your current values in a way that fixes accidents of evolution and flawed cognition and ignorance. But he doesn’t have an independent set of values to set up in opposition to your current values. I would, by contrast, feel comfortable saying “existence is better than non-existence”, “consciousness has value”, and “complexity is good”; those statements override whatever terminal values I have.
I don’t really think those statements are in the same category as my terminal values. My terminal values largely concern my own well-being, not what the universe should be like. My preference for good things to happen to me can’t be true or false.
I’m not comfortable with using “preference” and “value” interchangeably, either. “Preference” connotes likes: chocolate, classical music, fast cars, social status. My preferences are just lists of adjectives. “Value” connotes something with more structure: justice, liberty. Language treats these things as adjectives, but they aren’t primitive adjectives. They are predicates of more than one argument; preferences seem to be predicates of just one argument.