At this point, I think I can provide a definitive answer to your earlier question, and it is … wait for it … “It depends on what you mean by revealed preference.” (Raise your hand if you saw that one coming! I’ll be here all week, folks!)
Specifically: if the AI is to do the “right thing,” then it has to get its information about “rightness” from somewhere, and given that moral realism is false (or however you want to talk about it), that information is going to have to come from humans, whether by scanning our brains directly or just superintelligently analyzing our behavior. Whether you call this revealed preference or Friendliness doesn’t matter; the technical challenge remains the same.
One argument against using the term revealed preference in this context is that the way the term gets used in economics fails to capture some of the key subtleties of the superintelligence problem. We want the AI to preserve all the things we care about, not just the most conspicuous things. We want it to consider not just that Lucas ate this-and-such, but also that he regretted it afterwards, where it should be stressed that regret is no less real a phenomenon than eating. But because economists often use their models to study big public things like the trade of money for goods and services, in the popular imagination, economic concepts are associated with those kinds of big public things, and not small private things like feeling regretful—even though you could make a case that the underlying decision-theoretic principles are actually general enough to cover everything.
If the math only says to maximize u(x) subject to x · p = y, there’s no reason things like ethical concerns or the wish to be a better person can’t be part of the x_i or p_j, but because most people think economics is about money, they’re less likely to realize this when you say revealed preference. They’ll object, “Oh, but what about the time I did this-and-such, but I wish I were the sort of person that did such-and-that?” You could say, “Well, you revealed your preference to do such-and-that in your other actions, at some other moments of your life,” or you could just choose a different word. Again, I’m not sure it matters.
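To make the point concrete, here’s a toy sketch of that same consumer problem with one deliberately non-monetary “good” included. The good names, prices, weights, and the log-utility form are all invented for illustration; nothing here is meant as anyone’s actual model of human values, only a demonstration that the optimization machinery doesn’t care whether an x_i is groceries or a clear conscience.

```python
# Toy consumer-choice problem: maximize u(x) subject to x . p = y.
# All names and numbers below are made up for illustration.
import numpy as np
from scipy.optimize import minimize

goods = ["groceries", "entertainment", "hours_spent_becoming_a_better_person"]
p = np.array([4.0, 6.0, 2.0])   # "prices"; for the third good, an opportunity cost
y = 100.0                       # the budget (money, time, whatever the constraint is)
a = np.array([0.5, 0.2, 0.3])   # how much the agent cares about each good

def neg_utility(x):
    # Cobb-Douglas-style log utility: u(x) = sum_i a_i * log(x_i)
    return -np.sum(a * np.log(x))

budget = {"type": "eq", "fun": lambda x: p @ x - y}  # enforce x . p = y
result = minimize(neg_utility, x0=np.ones(3),
                  bounds=[(1e-6, None)] * 3, constraints=[budget])

for name, quantity in zip(goods, result.x):
    print(f"{name}: {quantity:.2f}")
```

With this utility function the optimum spends the fraction a_i of the budget on good i, so “hours spent becoming a better person” ends up with 30 percent of the budget, exactly as if it were any other good. The math never asked whether it was a public purchase or a private aspiration.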