Good link. I agree with guarding against wrong epistemic estimates of values (good wording).
Our disagreement comes down to this (I think): is “I want to want X”
a) an epistemic estimate of a value, or
b) a value in itself, pattern matching “I want Y”, with Y being “to want X”?
Consider a LW reader saying “I want to be a more rational reasoning agent” when previously she did not (this does not fit “want to want”, but it also states a potentially new element of a utility function, one possibly at odds with previous versions of that utility function).
Could that reader be wrong about that? Or could there merely be a contradiction between the (consciously, how else?) stated value and other, conflicting values?
You’d say such a stated value can be wrong because it is merely an epistemic estimate of a value.
But why can you not introduce new values by wanting to want new values? Can you not (sorry) consciously try to modify your utility function at all? That would seem a bit fatalistic.
But why can you not introduce new values by wanting to want new values?
You can; it might be a bad idea (for some senses of “values”); and if you believe that you are doing that, it’s not necessarily true, even though it might be.
I’m not saying that it’s impossible to be correct; I’m saying that it’s possible to be mistaken. In many situations where people claim to be correct about their values, there appears to be no strong reason to expect that to be so, so they shouldn’t have that much certainty.