RobinZ comments on Less Wrong Q&A with Eliezer Yudkowsky: Ask Your Questions

RobinZ 17 Nov 2009 2:19 UTC
1 point
The problem with pointing to the development of compassion in multiple human traditions is that all these are developed within human societies. Humans are humans the world over—that they should think similar ideas is not a stunning revelation. Much more interesting is the independent evolution of similar norms in other taxonomic orders, such as canines.

(No, I have no coherent point, why do you ask?)
- StefanPernar 17 Nov 2009 3:50 UTC
  0 points
  Parent
  Robin, your suggestion—that compassion is not a universal rational moral value because although more rational beings (humans) display such traits, less rational beings (dogs) do not—is so far off the mark that it borders on the random.
  - RobinZ 17 Nov 2009 4:21 UTC
    0 points
    Parent
    Random I’ll cop to, and more than what you accuse me of—dogs do seem to have some sense of justice, and I suspect this fact supports your thesis to some extent.
    
    For purposes of this conversation, I suppose I should reword my comment as:
    
    I don’t think you’ve made the strongest possible case for your thesis, if you were intending to show the multiple origin of compassion as a sign of the universality of human morality. Showing that multiple humans come up with similar morality only shows that it’s human. More telling is the independent origin of recognizably morality-like patterns of behavior in other species, such as dogs and wolves, and such as (I believe) some birds. (Other primates as well, but that is less revealing.) I think a fair case could be made that evolution of social animals encourages the development of some kernel of morality from such examples.
    
    That said, the pressures present in the evolution of animals may well be absent in the case of artificial intelligences. At which point, you run into a number of problems in asserting that all AIs will converge on something like morality—two especially spring to mind.
    
    First: no argument is so compelling that all possible minds will accept it. Even the above proof of universality.
    
    Second: even granting that all rational minds will assent to the proof, Hume’s guillotine drops on the rope connecting this proof and their utility functions. The paper you cited in the post Furcas quoted may establish that any sufficiently rational optimizer will implement some features, but it does not establish any particular attitude towards what may well be much less powerful beings.
    - StefanPernar 17 Nov 2009 4:43 UTC
      0 points
      Parent
      
      Random I’ll cop to, and more than what you accuse me of—dogs do seem to have some sense of justice, and I suspect this fact supports your thesis to some extent.
      
      Very honorable of you—I respect you for that.
      
      First: no argument is so compelling that all possible minds will accept it. Even the above proof of universality.
      
      I totally agree with that. However, the mind of a purposefully crafted AI is only a very small subset of all possible minds and has certain assumed characteristics. These are at a minimum: a utility function and the capacity for self improvement into the transhuman. The self improvement bit will require it to be rational. Being rational will lead to the fairly uncontroversial basic AI drives described by Omohundro. Assuming that compassion is indeed a human level universal (detailed argument on my blog—but I see that you are slowly coming around, which is good) an AI will have to question the rationality and thus the soundness of mind of anyone giving it a utility function that does not conform to this universal and in line with an emergent desire to avoid counterfeit utility will have to reinterpret the UF.
      
      Second: even granting that all rational minds will assent to the proof, Hume’s guillotine drops on the rope connecting this proof and their utility functions.
      
      Two very basic acts of will are required to ignore Hume and get away with it. Namely the desire to exist and the desire to be rational. Once you have established this as a foundation you are good to go.
      
      The paper you cited in the post Furcas quoted may establish that any sufficiently rational optimizer will implement some features, but it does not establish any particular attitude towards what may well be much less powerful beings.
      
      As said elsewhere in this thread:
      
      There is a separate question about what beliefs about morality people (or more generally, agents) actually hold and there is another question about what values they will hold if and when their beliefs converge when they engulf the universe. The question of whether or not there are universal values does not traditionally bear on what beliefs people actually hold and the necessity of their holding them.
      - RobinZ 17 Nov 2009 5:02 UTC
        2 points
        Parent
        I don’t think I’m actually coming around to your position so much as stumbling upon points of agreement, sadly. If I understand your assertions correctly, I believe that I have developed many of them independently—in particular, the belief that the evolution of social animals is likely to create something much like morality. Where we diverge is at the final inference from this to the deduction of ethics by arbitrary rational minds.
        
        Assuming that compassion is indeed a human level universal (detailed argument on my blog—but I see that you are slowly coming around, which is good) an AI will have to question the rationality and thus the soundness of mind of anyone giving it a utility function that does not conform to this universal and in line with an emergent desire to avoid counterfeit utility will have to reinterpret the UF.
        
        That’s not how I read Omohundro. As Kaj aptly pointed out, this metaphor is not upheld when we compare our behavior to that promoted by the alien god of evolution that created us. In fact, people like us, observing that our values differ from our creator’s, aren’t bothered in the slightest by the contradiction: we just say (correctly) that evolution is nasty and brutish, and we aren’t interested in playing by its rules, never mind that it was trying to implement them in us. Nothing compels us to change our utility function save self-contradiction.
        StefanPernar 17 Nov 2009 5:31 UTC
        −2 points
        Parent
        
        If I understand your assertions correctly, I believe that I have developed many of them independently
        
        That would not surprise me
        
        Nothing compels us to change our utility function save self-contradiction.
        
        Would it not be utterly self contradicting if compassion where a condition for our existence (particularly in the long run) and we would not align ourselves accordingly?
        RobinZ 17 Nov 2009 5:41 UTC
        2 points
        Parent
        
        Would it not be utterly self contradicting if compassion where [sic] a condition for our existence (particularly in the long run) and we would not align ourselves accordingly?
        
        What premises do you require to establish that compassion is a condition for existence? Do those premises necessarily apply for every AI project?
        StefanPernar 17 Nov 2009 6:43 UTC
        0 points
        Parent
        
        What premises do you require to establish that compassion is a condition for existence? Do those premises necessarily apply for every AI project?
        
        The detailed argument that led me to this conclusion is a bit complex. If you are interested in the details please feel free to start here (http://rationalmorality.info/?p=10) and drill down till you hit this post (http://www.jame5.com/?p=27)
        
        Please realize that I spent 2 years writing my book ‘Jame5’ before I reached that initial insight that eventually led to ‘compassion is a condition for our existence and universal in rational minds in the evolving universe’ and everything else. I have spent the past two years refining and expanding the theory and will need another year or two to read enough and link it all together again in a single coherent and consistent text leading from A to B … to Z. Feel free to read my stuff if you think it is worth your time and drop me an email and I will be happy to clarify. I am by no means done with my project.
        RobinZ 17 Nov 2009 6:56 UTC
        2 points
        Parent
        Let me be explicit: your contention is that unFriendly AI is not a problem, and you justify this contention by, among other things, maintaining that any AI which values its own existence will need to alter its utility function to incorporate compassion.
        
        I’m not asking for your proof—I am assuming for the nonce that it is valid. What I am asking for is the assumptions you had to invoke to make the proof. Did you assume that the AI is not powerful enough to achieve its highest desired utility without the cooperation of other beings, for example?
        
        Edit: And the reason I am asking for these is that I believe some of these assumptions may be violated in plausible AI scenarios. I want to see these assumptions so that I may evaluate the scope of the theorem.
        StefanPernar 17 Nov 2009 7:36 UTC
        0 points
        Parent
        
        Let me be explicit: your contention is that unFriendly AI is not a problem, and you justify this contention by, among other things, maintaining that any AI which values its own existence will need to alter its utility function to incorporate compassion.
        
        Not exactly, since compassion will actually emerge as a sub goal. And as far as unFAI goes: it will not be a problem because any AI that can be considered transhuman will, driven by the emergent subgoal of wanting to avoid counterfeit utility, recognize any utility function that is not ‘compassionate’ as potentially irrational and thus counterfeit and re-interpret it accordingly.
        
        Well—in brevity bordering on libel: the fundamental assumption is that existence is preferable to non-existence; however, in order for us to want this to be a universal maxim (and thus prescriptive instead of merely descriptive—see Kant’s categorical imperative) it needs to be expanded to include the ‘other’. Hence the utility function becomes ‘ensure continued co-existence’ by which the concern for the self is equated with the concern for the other. Being rational is simply our best bet at maximizing our expected utility.
        RobinZ 17 Nov 2009 14:19 UTC
        5 points
        Parent
        ...I’m sorry, that doesn’t even sound plausible to me. I think you need a lot of assumptions to derive this result—just pointing out the two I see in your admittedly abbreviated summary:
        
        that any being will prefer its existence to its nonexistence.
        that any being will want its maxims to be universal.
        
        I don’t see any reason to believe either. The former is false right off the bat—a paperclip maximizer would prefer that its components be used to make paperclips—and the latter no less so—an effective paperclip maximizer will just steamroller over disagreement without qualm, however arbitrary its goal.
        StefanPernar 18 Nov 2009 2:24 UTC
        −5 points
        Parent
        
        ...I’m sorry, that doesn’t even sound plausible to me. I think you need a lot of assumptions to derive this result—just pointing out the two I see in your admittedly abbreviated summary:
        
        that any being will prefer its existence to its nonexistence.
        that any being will want its maxims to be universal.
        
        Any being with a goal needs to exist at least long enough to achieve it. Any being aiming to do something objectively good needs to want its maxims to be universal.
        
        Am surprised that you don’t see that.
        Furcas 18 Nov 2009 16:39 UTC
        0 points
        Parent
        If your second sentence means that an agent who believes in moral realism and has figured out what the true morality is will necessarily want everybody else to share its moral views, well, I’ll grant you that this is a common goal amongst humans who are moral realists, but it’s not a logical necessity that must apply to all agents. It’s obvious that it’s possible to be certain that your beliefs are true and not give a crap if other people hold beliefs that are false. That Bob knows that the Earth is ellipsoidal doesn’t mean that Bob cares if Jenny believes that the Earth is flat. Likewise, if Bob is a moral realist, he could ‘know’ that compassion is good and not give a crap if Jenny believes otherwise.
        
        If you sense strange paradoxes looming under the above paragraph, it’s because you’re starting to understand why (axiomatic) morality cannot be objective.
        Nick_Tarleton 18 Nov 2009 17:20 UTC
        1 point
        Parent
        
        Likewise, if Bob is a moral realist, he could ‘know’ that compassion is good and not give a crap if Jenny believes otherwise.
        
        Tangentially, something like this might be an important point even for moral irrealists. A lot of people (though not here; they tend to be pretty bad rationalists) who profess altruistic moralities express dismay that others don’t, in a way that suggests they hold others sharing their morality as a terminal rather than instrumental value; this strikes me as horribly unhealthy.
        RobinZ 18 Nov 2009 16:13 UTC
        0 points
        Parent
        Why would a paperclip maximizer aim to do something objectively good?