This seems to confuse stuff that happens to a human with decision theory. What happens with a human (in the human’s thoughts, etc.) can’t be “contradictory” apart from a specific interpretation that labels some things “contradictory”, and that interpretation isn’t fundamentally interesting for the purpose of optimizing that stuff. The ontology problem is asked about the FAI, not about a person that is optimized by the FAI. For the FAI, a person is just a pattern in the environment, like any other object, with stars and people and paperclips all fundamentally alike; the only thing that distinguishes them for the FAI is what preference says should be done in each case.
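To make the “person is just another pattern” point concrete, here is a minimal toy sketch (Python, all names hypothetical, not an actual FAI design): the decision rule only consults a preference function over predicted environment states, and nothing in the procedure itself treats people, stars, or paperclips as special kinds of objects.

```python
# Toy sketch (hypothetical names, not an actual FAI design): the decision
# rule consults only a preference function over predicted environment
# states. "Stars", "people" and "paperclips" are just features of those
# states; none of them is a special type in the decision procedure itself.

from typing import Callable, Dict, List

EnvironmentState = Dict[str, int]  # e.g. counts of patterns in the environment

def choose_action(
    actions: List[str],
    predict: Callable[[str], EnvironmentState],
    preference: Callable[[EnvironmentState], float],
) -> str:
    """Pick the action whose predicted outcome the preference ranks highest."""
    return max(actions, key=lambda a: preference(predict(a)))

# A preference may happen to weight person-patterns heavily, but that
# weighting lives entirely inside the preference, not in the decision rule.
def example_preference(state: EnvironmentState) -> float:
    return 1000.0 * state.get("flourishing_people", 0) + 0.001 * state.get("paperclips", 0)

def example_predict(action: str) -> EnvironmentState:
    # Stand-in world model mapping actions to predicted outcomes.
    return {
        "help_people": {"flourishing_people": 10, "paperclips": 0},
        "make_paperclips": {"flourishing_people": 0, "paperclips": 10_000},
    }[action]

print(choose_action(["help_people", "make_paperclips"], example_predict, example_preference))
# -> "help_people", purely because the preference says so
```

Whatever special treatment people get has to come entirely from the content of the preference, not from the decision machinery.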
When we are talking about decision theory for the FAI, especially while boxing the ontology inside the FAI, it’s not obvious how to connect that with particular interpretations of what happens in the environment, nor should we really try.
Now, speaking of people in the environment, we might say that the theist is going to feel frustrated for some time upon realizing that they were confused for a long time. However, I can’t imagine the whole process of deconverting actually failing to be preferable to remaining confused (especially given that, in the long run, the person will need to grow up). Even the optimal strategy is going to have identifiable negative aspects, but those only make a strategy suboptimal if there is a better way. Also, for many of the obvious negative aspects, such as the negative emotions accompanying an otherwise desirable transition, the FAI is going to invent a way of avoiding that aspect, if doing so is desirable.
the only thing that distinguishes them for the FAI is what preference says should be done in each case.
And that the person might be the source of preference. This is fairly important. But, in any case, FAI theory is only here as an intuition pump for evaluating “what would the best thing be, according to this person’s implicit preferences?”
If it is possible to have preference-like things within a fundamentally contradictory belief system, and that’s all the human in question has, then knowing about the inconsistency might be bad.
And that the person might be the source of preference. This is fairly important.
This is actually wrong. Whatever the AI starts with is its formal preference; it never changes, and it never depends on anything. That this formal preference was intended to copy an existing pattern in the environment is a statement about what sort of formal preference it is, but it is enacted the same way, in accordance with what should be done in each particular case based on what the formal preference says. Thus, what you’ve highlighted in the quote is a special case, not an additional feature. Also, I doubt it can work this way.
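One way to see the “it never changes” point in miniature (a toy sketch with hypothetical names, not a claim about how an actual FAI would be built): the formal preference may be constructed by reading a pattern off the environment once, but the resulting function is then fixed, and the agent enacts it the same way no matter what that pattern does afterwards.

```python
# Toy sketch (hypothetical, not an actual FAI design): a formal preference
# may be *constructed* from a snapshot of a pattern in the environment,
# but once constructed it is a fixed function; later changes to that
# pattern do not change what gets optimized.

from typing import Callable, Dict

EnvironmentState = Dict[str, float]

def build_formal_preference(snapshot: EnvironmentState) -> Callable[[EnvironmentState], float]:
    """Freeze weights read off the environment at construction time."""
    frozen = dict(snapshot)  # copied once, never updated afterwards
    return lambda state: sum(frozen.get(k, 0.0) * v for k, v in state.items())

person_pattern = {"art": 2.0, "friendship": 5.0}
preference = build_formal_preference(person_pattern)

person_pattern["friendship"] = -5.0                  # the pattern in the environment changes...
print(preference({"art": 1.0, "friendship": 1.0}))   # ...but the formal preference does not: prints 7.0
```

That the snapshot was meant to mirror a person is then a fact about the content of the frozen function, not an extra mechanism that keeps updating it.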
But, in any case, FAI theory is only here as an intuition pump for evaluating “what would the best thing be, according to this person’s implicit preferences?”
True, but the implicit preference is not something the person realizes to be preferable, nor something expressed in terms of the confused “ontology” that the person believes in. The implicit preference is a formal object that isn’t built from fuzzy patterns interpreted in the person’s thoughts. When you speak of “contradictions” in the person’s beliefs, you are speaking on the wrong level of abstraction, as if you were discussing the parameters of a clustering algorithm as being relevant to the reliable performance of the hardware on which that algorithm runs.
If it is possible to have preference-like things within a fundamentally contradictory belief system, and that’s all the human in question has, then knowing about the inconsistency might be bad.
A belief system can’t be “fundamentally contradictory” because it’s not “fundamental” to begin with. What do you mean by “bad”? Bad according to what? It doesn’t follow from confused thoughts that preference is somehow brittle.