Sure, we favor the particular Should Function that is, today, instantiated in the brains of intelligent, roughly politically middle-of-the-road Westerners.
Do you think there is no simple procedure that would find roughly the same “should function” hidden somewhere in the brain of a brainwashed, bloodthirsty religious zealot? It doesn’t need to be what the person believes, or what the person would recognize as valuable; just something extractable from the person, according to a criterion that might be very alien to their conscious mind. Not all opinions (beliefs/likes) are equal, and I wouldn’t want to get stuck with the wrong optimization criterion just because I happened to be born in the wrong place and didn’t (yet!) get the chance to learn more about the world.
(I’m avoiding the term ‘preference’ to remove connotations I expect it to have for you, for what I consider the wrong reasons.)
A lot of people seem to want to have their cake and eat it with CEV. Haidt has shown us that human morality is universal in form and local in content, and has gone on to do case studies showing that there are 5 basic human moral dimensions (harm/care, justice/fairness, loyalty/ingroup, respect/authority, purity/sacredness), and our culture only has the first two.
It seems that there is no way you can run an honestly morally neutral CEV of all of humanity and expect to reliably get something you want. You can either rig CEV so that it tweaks people who don’t share our moral drives, or you can just cross your fingers and hope that the process of extrapolation causes convergence to our idealized preferences; if you’re wrong, you’ll find yourself in a future that is suboptimal.
Haidt just claims that the relative balance of those five clusters differs across cultures; they’re present in all.
On one hand, using preference-aggregation is supposed to give you the outcome preferred by you to a lesser extent than if you just started from yourself. On the other hand, CEV is not “morally neutral”. (Or at least, the extent to which preference is given in CEV implicitly has nothing to do with preference-aggregation.)
We have a tradeoff between the number of people to include in preference-aggregation and the value-to-you of the outcome. So this is a situation in which to use the reversal test. If you consider including only the smart, sane Westerners preferable to including all presently alive folks, then you need a good argument for why you wouldn’t want to exclude some of the smart, sane Westerners as well, up to the point of leaving only yourself.
Yes, a CEV of only yourself is, by definition, optimal.
The reason I don’t recommend you try it is that it is infeasible: the probability of success is very low. By including a bunch of people who (you have good reason to think) are a lot like you, you eventually reach the optimal point in the tradeoff between quality of outcome and probability of success.
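This tradeoff can be sketched numerically. Everything below — the decay of outcome value and the growth of success probability with coalition size — is an invented toy model with made-up constants, purely for illustration:

```python
# Toy model of the tradeoff between quality of outcome and
# probability of success. All functional forms and constants
# are invented assumptions, not anything from CEV itself.

def outcome_value(n: int) -> float:
    """Value-to-you of the outcome; assumed to decay as more
    (increasingly unlike-you) people are included."""
    return 1.0 / (1.0 + 0.001 * n)

def success_probability(n: int) -> float:
    """Chance the project succeeds at all; assumed to grow with
    the number of included (hence supportive) people."""
    return 1.0 - 0.999 ** n

def expected_value(n: int) -> float:
    return outcome_value(n) * success_probability(n)

# Under these assumptions the expected value peaks at an
# intermediate coalition size: neither "only yourself" (n = 1)
# nor "everyone" comes out optimal.
best = max(range(1, 20000), key=expected_value)
print(best, expected_value(best))
```

The precise optimum is an artifact of the invented curves; the point is only that both endpoints lose to some interior coalition size.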
I hope you realize that you are in flat disagreement with Eliezer about this. He explicitly affirmed that running CEV on himself alone, if he had the chance to do it, would be wrong.
Confirmed.
Eliezer quite possibly does believe that. That he can make that claim with some credibility is one of the reasons I am less inclined to use my resources to thwart Eliezer’s plans for future light cone domination.
Nevertheless, Roko is right more or less by definition and I lend my own flat disagreement to his.
“Low probability of success” should of course include game-theoretic considerations where people are more willing to help you if you give more weight to their preferences (and should refuse to help you if you give them too little, even if it’s much more than the status quo, as in the Ultimatum game). As a rule, in the Ultimatum game you should give away more if giving away less would cost you the deal. When you lose value to other people in exchange for their help, having compatible preferences doesn’t necessarily significantly alleviate this loss.
Sorry, I don’t follow this: can you restate?
having compatible preferences doesn’t necessarily significantly alleviate this loss.
I know about the ultimatum game, but it is game-theoretically interesting precisely because the players have different preferences: I want all the money for me, you want all of it for you.
The Ultimatum game was mentioned primarily as a reminder that the amount of FAI-value traded for assistance may be orders of magnitude greater than what the assistance feels to amount to.
We might as well take as a given that all the values under discussion are (at least to some small extent) different. The “all of the money” here corresponds to the points of disagreement: mutually exclusive features of the future. But you are not trading value for value. You are trading value-after-FAI for assistance-now.
If two people compete to provide you an equivalent amount of assistance, you should be indifferent between them in accepting it, which means it should cost you an equivalent amount of value. If Person A has preferences close to yours, and Person B has preferences distant from yours, then by losing the same amount of value you can help Person A more than Person B. Thus, if we assume egalitarian “background assistance”, provided implicitly by e.g. not revolting and stopping the FAI programmer, then everyone can still get a slice of the pie, no matter how distant their values. If nothing else, the more alien people should strive to help you more, so that you’ll be willing to part with more value for them (the marginal value of providing assistance is greater for distant-preference folks).
Thanks for the explanation.
FAI-value traded for assistance may be orders of magnitude greater than what the assistance feels to amount to.
Another way to put this is that when people negotiate, they do best, all other things equal, if they try to drive a very hard bargain. If my neighbour Claire and I are both from roughly the same culture, upbringing, etc., and we are together going to build an AI which will extrapolate a combination of our volitions, Claire might do well to demand a 99% weighting for her volition, and maybe I’ll bargain her down to 90% or something.
Bob the babyeater might offer me the same help that Claire could have given in exchange for just a 1% weighting of his volition, by the principle that I am making the same sacrifice in giving 99% of the CEV to Claire as in giving 1% to Bob.
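The equal-sacrifice arithmetic behind the Claire/Bob comparison can be made explicit. The linear overlap model and all the numbers below are my own assumptions, for illustration only:

```python
# Sketch of the "equal sacrifice" point: ceding weight to someone
# whose volition mostly overlaps mine costs me little per unit of
# weight, so a distant-preference party buys far less weight with
# the same amount of assistance. Linear model assumed throughout.

def my_sacrifice(weight_to_them: float, overlap: float) -> float:
    """Value I give up by ceding `weight_to_them` of the CEV
    weighting to a party whose extrapolated volition overlaps
    mine by `overlap` (both in [0, 1])."""
    return weight_to_them * (1.0 - overlap)

claire = my_sacrifice(0.99, overlap=0.99)  # near-twin neighbour
bob = my_sacrifice(0.01, overlap=0.01)     # Bob the babyeater

# Ceding 99% to Claire costs me the same as ceding 1% to Bob,
# which is why Claire-sized help can buy Bob only a 1% slice.
print(claire, bob)
```

Under this (assumed) linear model the two sacrifices come out identical, matching the 99%-versus-1% claim above.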
In reality, however, humans tend to live and work with people who are like them, rather than people who are unlike them. And the world we live in doesn’t have a uniform distribution of power and knowledge across cultures.
If nothing else, the more alien people should strive to help you more, so that you’ll be willing to part with more value for them
Many “alien” cultures are too powerless compared to ours to do anything. However, China and India are potential exceptions. The USA and China may end up in a dictator game over FAI motivations.
All I am saying is that the egalitarian desire to include all of humanity in CEV, each with equal weight, is not optimal. Yes to a dictator game/negotiation with China; yes to a dictator game/negotiation within the US/EU/Western bloc.
Excluding a group from the CEV doesn’t mean disenfranchising them; it means enfranchising them according to your definition of enfranchisement. Cultures in North Africa that genitally mutilate women should not be included in CEV, but I predict that my CEV would treat their culture with respect and dignity, including, in some cases, interfering to prevent them from using their share of the light-cone to commit extreme acts of torture or oppression.
You don’t include cultures in CEV; you filter people through extrapolation of their volition. Even if culture makes values differ, “mutilating women” is not the kind of thing that gets through, and so it is a broken prototype example to draw attention to.
In any case, my argument in the above comment was that value should be given (theoretically, if everyone understands the deal and the relevant game theory, etc., etc.; realistically, such a deal must be simplified; you may even get away with cheating) according to provided assistance, not according to compatibility of values. If poor compatibility of values prevents people from giving assistance, this is an effect of value completely unrelated to post-FAI compatibility, and given that assistance can be given with money, the effect itself doesn’t seem real either. You may well exclude the people of Myanmar, because they are poor and can’t affect your success, but not the people of a generous/demanding genocidal cult, for the irrelevant reason that they are evil. Game theory is cynical.
How do you know? If enough people want it strongly enough, it might.
How strongly people want something now doesn’t matter; reflection has the power to wipe current consensus clean. You are not cooking a mixture of wants, you are letting them fight it out, and a losing want doesn’t have to leave any residue. Only to the extent that current wants indicate extrapolated wants should we take them into account.
You are not cooking a mixture of wants, you are letting them fight it out, and a losing want doesn’t have to leave any residue.
Sure. And tolerance, gender equality, multiculturalism, personal freedoms, etc might lose in such a battle. An extrapolation that is more nonlinear in its inputs cuts both ways.
Might “mutilating men” make it through?
(Sorry for the euphemism; I mean male circumcision.)
you think there is no simple procedure that would find roughly the same “should function” hidden somewhere in the brain of a brain-washed blood-thirsty religious zealot?
Sure, the Kolmogorov complexity of a set of edits to change the moral reflective equilibrium of a human is probably pretty low compared to the complexity of the overall human preference set. But that works the other way around too. Somewhere hidden in the brain of a liberal western person is a murderer/terrorist/child abuser/fundamentalist, if you just perform the right set of edits.
But that works the other way around too. Somewhere hidden in the brain of a liberal western person is a murderer/terrorist/child abuser/fundamentalist, if you just perform the right set of edits.
Again, not all beliefs are equal. You don’t want to use the procedure that’ll find a murderer in yourself, you want to use the procedure that’ll find a nice fellow in a murderer. And given such a procedure, you won’t need to exclude murderers from extrapolated volition.