What would be wrong with an AI based on our revealed preferences? It sounds like an easy question, but somehow I’m having a hard time coming up with an answer.
Because my revealed preferences suck. The difference between what I want, even in an ordinary, non-transhumanist way, and what I actually have is enormous. I am 150 pounds heavier than I want to be. My revealed preference is to eat regardless of the health and size consequences, but I don’t want all of the people of the future to be fat. My revealed preference is also to kill people in Pooristan so that I can have cheap plastic widgets or food or whatever. I don’t want an extrapolation of my akratic actual actions controlling the future of the universe. I suspect the same goes for you.
Hmm. Let’s look more closely at the weight example, because the others are similar. You also reveal some degree of preference to be thin rather than fat, don’t you? Then an AI with unlimited power could satisfy both your desire to eat and your desire to be thin. And if the AI has limited power, do you really want it to starve you rather than go with your revealed preference?
Revealed preference means what my actual actions are. It doesn’t have anything at all to do with what I verbally say my goals are. I can say that I would prefer to be thin all I want, but that isn’t my revealed preference. My revealed preference is to be fat, because, you know, that’s how I’m acting. You seem to be laboring under some misapprehension about what you’re actually saying an AI should do. If your definition of revealed preference includes my desire not to be fat, then you should just shift to what I mean when I talk about preference, because your definition solves none of the problems you think it does.
Is your revealed preference to be fat, or is it to eat and exercise (or not exercise) in ways which incidentally result in your being fat?
I’m assuming that you revealed your preference to be thin in your other actions, at some other moments of your life. Pretty hard to believe that’s not the case.
At this point, I think I can provide a definitive answer to your earlier question, and it is … wait for it … “It depends on what you mean by revealed preference.” (Raise your hand if you saw that one coming! I’ll be here all week, folks!)
Specifically: if the AI is to do the “right thing,” then it has to get its information about “rightness” from somewhere, and given that moral realism is false (or however you want to talk about it), that information is going to have to come from humans, whether by scanning our brains directly or just superintelligently analyzing our behavior. Whether you call this revealed preference or Friendliness doesn’t matter; the technical challenge remains the same.
One argument against using the term revealed preference in this context is that the way the term gets used in economics fails to capture some of the key subtleties of the superintelligence problem. We want the AI to preserve all the things we care about, not just the most conspicuous things. We want it to consider not just that Lucas ate this-and-such, but also that he regretted it afterwards, where it should be stressed that regret is not any less real a phenomenon than eating is. But because economists mostly use their models to study big, public things like the trade of money for goods and services, economic concepts are, in the popular imagination, associated with those big public things rather than with small private things like feeling regretful—even though you could make a case that the underlying decision-theoretic principles are general enough to cover everything.
If the math only says to maximize u(x) subject to p·x = y, there’s no reason things like ethical concerns or the wish to be a better person can’t be among the x_i or reflected in the p_j; but because most people think economics is about money, they’re less likely to realize this when you say revealed preference. They’ll object, “Oh, but what about the time I did this-and-such, but I wish I were the sort of person who did such-and-that?” You could say, “Well, you revealed your preference to do such-and-that in your other actions, at some other moments of your life,” or you could just choose a different word. Again, I’m not sure it matters.
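To make that concrete, here’s a toy sketch—not anyone’s actual proposal—of the point that maximize u(x) subject to p·x = y is indifferent to what the goods are, so a private, psychological “good” like regret-free evenings can sit in the bundle right next to snacks. The good names, prices, budget, and the Cobb-Douglas utility below are all made-up illustrative assumptions.

```python
# Toy consumer problem: maximize u(x) subject to p·x = y, x >= 0,
# where one of the "goods" is psychological (regret-free evenings).
# All names and numbers are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

prices = np.array([2.0, 1.0, 3.0])   # snacks, plain meals, regret-free evenings
income = 30.0                        # the budget y
weights = np.array([0.5, 0.3, 0.2])  # Cobb-Douglas exponents (assumed)

def utility(x):
    # Cobb-Douglas utility: the regret-related good enters exactly like the others.
    return np.prod(np.maximum(x, 1e-9) ** weights)

# Maximize u(x) by minimizing -u(x) under the budget constraint and x >= 0.
result = minimize(
    lambda x: -utility(x),
    x0=np.full(3, income / prices.sum()),
    bounds=[(0, None)] * 3,
    constraints=[{"type": "eq", "fun": lambda x: prices @ x - income}],
)
print("optimal bundle:", result.x.round(2))  # ≈ [7.5, 9.0, 2.0]
```

The point isn’t the numbers; it’s that nothing in the formalism stops “being the sort of person I wish I were” from being one of the goods, which is the sense in which revealed preference could in principle carry all of this, even if it doesn’t in popular usage.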
What the AI is based on is what determines the way the world will actually be, so by building an AI with a given preference, you are inevitably answering my question about what to do with the world. It’s wrong to use revealed preference for an AI to exactly the extent that revealed preference gives the wrong answer to my question. You seem to agree that the correct answer to my question has little to do with revealed preference. This seems to be the same as seeing revealed preference as the wrong thing to imprint an AI with.