Kaj_Sotala comments on Making computer systems with extended Identity

Kaj_Sotala 8 Mar 2012 8:53 UTC
11 points
You sort of mentioned some of these points at the end of your essay, but just to spell them out more explicitly:

While I consider the individual cells in my body to be a part of me, I’d be ready to kill some of them (e.g. ones infected with a disease) if necessary. Likewise, I consider my arm a part of me, but I would be (albeit quite reluctantly) ready to have it amputated if that was needed to save the rest of the body.

These aren’t merely examples of self-preservation: if somebody developed cybernetic implants that were as good and better than my original body parts, I might be willing to swap.

Looking at a more abstract level, emotions such as phobias, fears and neuroses are also an important part of me, and they’re generated by parts of my brain which quite certainly are important parts of me. Yet I would mostly be just glad to be rid of these emotions. Although I would not want to get rid of the parts of my brain that generate them, I would like to have those parts substantially modified.

Simply having something designated as a part of yourself doesn’t mean that you’ll protect it. Even if it’s an important part of yourself and you do want to protect it, you might be willing to rebuild it entirely, in effect destroying the original and replacing it with something almost completely different. An AI that considered humanity an important part of itself might still be completely willing to replace humanity with robots of its own design, if it thought that it was upgrading a part of itself into something better that way.

What you actually want is to make the preservation of humanity important for the AI’s goals, for the kinds of definitions of “humanity” that we’d want the AI to consider “humanity”, and with goals that the correspond to what humanity wants to want. And then the problem reduces back to just defining goals that make the AI treat humans as we’d want it to treat us, with the (anthropomorphic) concept of identity becoming entirely redundant.
- whpearson 8 Mar 2012 12:04 UTC
  0 points
  Parent
  Personally I don’t consider uploading as preserving myself, as I extend “that which I wish to preserve” to my body, rather than just my computations. So we are going to clash somewhat.
  
  What you actually want is to make the preservation of humanity important for the AI’s goals, for the kinds of definitions of “humanity” that we’d want the AI to consider “humanity”, and with goals that the correspond to what humanity wants to want.
  
  I’d rather not have to define anything, due to ontological issues. Our brains generally do an okay job of fulfilling our biological imperative without having it explicitly defined. So engineer computer systems with a humanistic imperative. 1) we should be able to do a better job of it than evolution 2) with the society of AI’s idea to balance and punish those who stray to far from the humanistic (there are some examples in human society of pressure to be biologically normal, e.g. straight, non self mutilating etc).
  
  And then the problem reduces back to just defining goals that make the AI treat humans as we’d want it to treat us, with the (anthropomorphic) concept of identity becoming entirely redundant.
  
  Do you accept Omohondro’s drives? One of them being self-preservation. This needs some notion of self that is not anthropomorphic.
  - Kaj_Sotala 8 Mar 2012 14:31 UTC
    5 points
    Parent
    
    Our brains generally do an okay job of fulfilling our biological imperative without having it explicitly defined.
    
    “The biological imperative” is “survive, have offspring, and make sure your offspring do well”. That’s a vastly simpler goal than “create the kind of a society that humans would want to want to have”. (An AI that carries out the biological imperative is called a paperclipper.)
    
    Also, even if our brain don’t have an explicit definition for it, it’s still implicitly there. You can’t not define a goal for an AI—the only question is whether you do it explicitly or implicitly.
    
    2) with the society of AI’s idea to balance and punish those who stray to far from the humanistic (there are some examples in human society of pressure to be biologically normal, e.g. straight, non self mutilating etc).
    
    Postulating a society of AIs being in control, instead of a single AI being in control, wouldn’t seem to make the task of goal system design any easier. Now, instead of just having to figure out a way to make a single AI’s goals to be what we want, you have to design an entire society of AIs in a way that the goals the society ends up promoting overall will be what we want. Taking a society of AIs as the basic unit only makes things more complex.
    
    Do you accept Omohondro’s drives? One of them being self-preservation. This needs some notion of self that is not anthropomorphic.
    
    I accept them. Like I said in the comment above, the definition of a self in the context of the “self-preservation” drive will be based on the AI’s goals.
    
    Remember that the self-preservation drive is based on the simple notion that the AI wants its goals to be achieved, and if it is destroyed, then those goals (probably) cannot be achieved, since there’s nobody around who’d work to achieve them. Whether or not the AI itself survives isn’t actually relevant—what’s relevant is whether such AIs (or minds-in-general) survive which will continue to carry out the goals, regardless of whether or not those AIs happen to be “same” AI.
    
    If an AI has “maximize the amount of paperclips in the universe” as its goal, then the “self-preservation drive” corresponds to “protect agents which share this goal (and are the most capable of achieving it)”.
    
    If an AI has as its goal “maximize the amount of paperclips produced by this unit”, then the “this unit” that it will try to preserve will depend on how its programming (implicitly or explicitly) defines “this unit”… or so it would seem at first.
    
    If we taboo “this unit”, we get “maximize the amount of paperclips produced by [something that meets some specific criteria]”. To see the problem in this, consider that if we had an AI that had been built to “maximize the amount of paperclips produced by the children of its inventor”, that too would get taboo’d into “maximize the amount of paperclips produced by [something that meets some specific criteria]”. The self-preservation drive again collapses into “protect agents which share this goal (and are the most capable of achieving it)”: there’s no particular reason to use the term “self”.
    
    To put it differently: with regard to the self-preservation drive, there’s no difference in whether the goal is to “maximize paperclips produced by this AI” or “maximize the happiness of humanity”. In both cases, the AI is trying to do something to some entity which is defined in some specific way. In order for something to be done to such an entity, some agent must survive which has as its goal the task of doing such things to such an entity, and the AI will make sure that such agents try to survive.
    
    Of course it must also make sure that the target of the optimizing-activity survives, but that’s separate from the self-preservation drive.
    
    (I’m not sure of how clearly I’m expressing myself here—did folks understand what I was trying to say?)
    - whpearson 8 Mar 2012 15:20 UTC
      0 points
      Parent
      
      Also, even if our brain don’t have an explicit definition for it, it’s still implicitly there. You can’t not define a goal for an AI—the only question is whether you do it explicitly or implicitly.
      
      Can we make a system that has a human as (part of) an implicit definition of its goal system? When you allow implicit definitions they can be made non-spatially located, although some information will need to flow between them.
      - Kaj_Sotala 8 Mar 2012 15:36 UTC
        0 points
        Parent
        In principle? Sure.
        
        In practice? I have no idea.
        whpearson 8 Mar 2012 15:51 UTC
        0 points
        Parent
        I’m not sure if I am making myself clear. Just to check. I am interested in exploring systems where a human is an important computational component (not just pointed at) of an implicit goal system for an advanced computer system.
        
        Because the human part is implicit, the system might not make the correct inferences and judge the human to be important. If there was a society of these systems, then if we engineered things correctly then most of them would make the correct inference and judge the human an important part of their goal system and they may be able to exert pressure on those that didn’t.
        
        Does that make more sense?
        Kaj_Sotala 9 Mar 2012 13:22 UTC
        0 points
        Parent
        Okay. I thought that you meant something like that, but this clarified it.
        
        I’m not sure why you think it’s better to build a society of these systems than to build just a single one. It seems to just make things more difficult: instead of trying to make sure that one AI does things right, we need to make sure that the overall dynamic that emerges from a society of interacting AIs does things right. That sounds a lot harder.
        whpearson 15 Mar 2012 11:27 UTC
        0 points
        Parent
        A few reasons.
        
        1) I am more skeptical of singleton take off. While I think it is possible, I don’t think it is likely that humans will be able to engineer it.
        
        2)Logistics. If identity requires high bandwidth data connections between the two parts it would be easier to have a distributed system.
        
        3) Politics. I doubt politicians will trust anyone to build a giant system to look after the world.
        
        4) Letting the future take care of itself. If the systems do consider the human part of themselves, then they might be better placed to figure out an overarching way to balance everyones needs.