Yeah, as Vladimir guessed, this is all familiar.
Your last paragraph suggests that you’ve misunderstood my view. I’m not making an empirical claim to the effect that all agents will eventually converge to our values—I agree that that’s obviously false. I don’t even think that all formally intelligent agents are guaranteed to have normative concepts like ‘ought’, ‘reason’, or ‘morality’. The claim is just that such a radically different agent could share our normative concepts (in particular, our aspiration to a mind-independent standard), even if they would radically disagree with us about which things fall under the concept. We could both have full empirical knowledge about our own and each other’s desires/dispositions, and yet one (or both) of us might be wrong about what we really have reason to want and to do.
(Aside: the further claim about “reasons” in your last sentence presupposes a subjectivist view about reasons that I reject.)
What use is this concept of “reasonability”? Let’s say I build an agent that wants to write the first 1000 Fibonacci numbers in mile-high digits on the Moon, except skipping the 137th one. When you start explaining to the agent that it’s an “arbitrary omission” and it “should” amend its desires for greater “consistency”, the agent just waves you off because listening to you isn’t likely to further its current goals. Listening to you is not rational for the agent in the sense that most people on LW use the term: it doesn’t increase expected utility. If by “rational” you mean something else, I’d like to understand what exactly.
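A minimal toy sketch of that expected-utility sense of "rational", with made-up goal encoding, probabilities, and action names (nothing here is from the original comment): "listening to the critic" is modeled as an action that changes the agent's goals without advancing its current utility function, so under that utility function it scores zero expected utility.

```python
# Toy sketch: an agent that picks actions purely by expected utility
# under its current goals. All details below are illustrative.

def fibonacci(n):
    """Return the first n Fibonacci numbers (1, 1, 2, 3, ...)."""
    seq, a, b = [], 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

# The agent's terminal goal: the first 1000 Fibonacci numbers, minus the 137th.
GOAL = [f for i, f in enumerate(fibonacci(1000), start=1) if i != 137]

def utility(world_state):
    """1 if the Moon bears exactly the goal inscription, else 0."""
    return 1.0 if world_state.get("moon_inscription") == GOAL else 0.0

def expected_utility(action, world_state):
    """Sum over outcomes of P(outcome) * U(outcome)."""
    return sum(p * utility(outcome) for p, outcome in action["outcomes"](world_state))

def keep_engraving(state):
    done = dict(state, moon_inscription=GOAL)
    return [(0.9, done), (0.1, state)]         # 90% chance the engraving succeeds

def listen_to_critic(state):
    revised = dict(state, goals_revised=True)  # goals change, the inscription doesn't
    return [(1.0, revised)]

actions = [
    {"name": "keep engraving", "outcomes": keep_engraving},
    {"name": "listen to the critic", "outcomes": listen_to_critic},
]

state = {"moon_inscription": None}
for a in actions:
    print(f'{a["name"]}: EU = {expected_utility(a, state):.2f}')
best = max(actions, key=lambda a: expected_utility(a, state))
print("Chosen action:", best["name"])          # -> "keep engraving"
```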
I mean ‘rational’ in the ordinary, indefinable sense, whereby calling a decision ‘irrational’ expresses a distinctive kind of criticism—similar to that expressed by the words ‘crazy’, ‘foolish’, ‘unwise’, etc. (By contrast, you can just say “maximizes expected utility” if you really mean nothing more than maximizes expected utility—but note that that’s a merely descriptive concept, not a normative one.)
If you don’t possess this concept—if you never have thoughts about what’s rational, over and above just what maximizes expected utility—then I can’t help you.
I don’t think we can make progress with such imprecise thinking. Eliezer has a nice post about that.
When we're trying to reduce intuitions, there's no avoiding starting from informal ideas. It's a separate point that stopping there would be improper, but Richard doesn't exactly suggest stopping there.
The question more salient to me is: what good are proposals to redefine this intuitive idea of "rational", if it's supposed to be the source material? This question even explains why the idea of considering, say, "consciousness" in some more precise sense is methodologically a step in the wrong direction: when words are the data you work with, you should be careful to assign new words to the new ideas used for analyzing them.
In this old comment, he does seem to suggest stopping there.
I’m not sure I understand your second paragraph. Are you suggesting that if we come up with a new theory to explain some aspect of consciousness, we should use a word other than “consciousness” in that theory, to avoid potentially losing some of our intuitions about consciousness?
Yes.