Seems like X is (or includes) the ability to think about self-modification: awareness of its own internal details and modelling their possible changes.
Note that without this ability the tool could invent a plan which leads to its own accidental destruction (and thus possibly to the plan never being completed), because it does not realize it could be destroyed or damaged.

An agent can also accidentally pursue a plan which leads to its self-destruction. People do it now and then by not modelling the world well enough.
I think of agents as having goals and pursuing them by default. I don’t see how self-reflexive abilities (“think about self-modification: awareness of its own internal details and modelling their possible changes”) add up to goals. It might be intuitive that a self-aware entity would want to preserve its existence, but that intuition could be driven by anthropomorphism (or zoomorphism, or biomorphism).
With self-reflective abilities, the system can also consider paths to its goal that include self-modification. Some of those paths may be highly unintuitive for humans, so we wouldn’t notice some of the possible dangers. Self-modification may also remove some safety mechanisms.
A system that explores many paths can find solutions humans wouldn’t notice. Such “creativity” at the object level is relatively harmless. Google Maps may find you a more efficient path to work than the one you use now, but that’s okay. Maybe the path is wrong for reasons that Google Maps does not understand (e.g. it leads through a neighborhood with high crime), but at least on a general level you understand that this is the risk of following the outputs blindly. However, similar “creativity” at the self-modification level can have unexpectedly serious consequences.
“the system can also”, “some of those paths may be”, “may also remove”. Those are some highly conditional statements. Quantify, please, or else this is no different than “the LHC may destroy us all with a mini black hole!”
I’d need a specific description of the system, what exactly it can do, and how exactly it can modify itself, to give you a specific example of self-modification that contributes to a specific goal in a perverse way.
I can invent an example, but then you can just say “okay, I wouldn’t use that specific system”.
As an example: Imagine that you have a machine with two modules (whatever they are) called Module-A and Module-B. Module-A is only useful for solving Type-A problems. Module-B is only useful for solving Type-B problems. At this moment you have a Type-A problem, and you ask the machine to solve it as cheaply as possible. The machine has no Type-B problem at the moment. So the machine decides to sell its Module-B on eBay, because it is not needed now, and the money gained will reduce the total cost of solving your problem. This is short-sighted, because tomorrow you may need to solve a Type-B problem. But the machine does not predict your future wishes.
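To make the short-sightedness concrete, here is a minimal toy sketch (my own illustration, not anything from the discussion above): a planner that minimizes only the cost of the current task ranks the “sell Module-B” plan higher, because tomorrow’s Type-B problem simply isn’t part of its objective. All names and numbers (Plan, MODULE_B_RESALE_VALUE, the cost figures) are hypothetical.

```python
# Toy sketch of the Module-A / Module-B story above. The planner's objective is
# exactly what the user asked for: minimize the cost of solving the current
# Type-A problem. Tomorrow's Type-B problem is not in the objective, so selling
# the "unused" module looks strictly better.

from dataclasses import dataclass

MODULE_B_RESALE_VALUE = 50   # assumed resale price on eBay

@dataclass
class Plan:
    description: str
    compute_cost: int        # cost of running Module-A on the Type-A problem
    sells_module_b: bool     # whether the plan liquidates Module-B

    def cost(self) -> int:
        # Cost of solving *this* problem only; resale income counts as a discount.
        discount = MODULE_B_RESALE_VALUE if self.sells_module_b else 0
        return self.compute_cost - discount

candidate_plans = [
    Plan("solve the Type-A problem with Module-A", compute_cost=100, sells_module_b=False),
    Plan("solve the Type-A problem with Module-A, sell Module-B", compute_cost=100, sells_module_b=True),
]

best = min(candidate_plans, key=Plan.cost)
print(best.description)  # picks the plan that sells Module-B: cheaper under the
                         # stated objective, short-sighted about tomorrow's needs
```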
I can invent an example, but then you can just say “okay, I wouldn’t use that specific system”.
But can’t you see, that’s entirely the point!
If you design systems whereby the Scary Idea has no more than a vanishing likelihood of occurring, it no longer becomes an active concern. It’s like saying “bridges won’t survive earthquakes! you are crazy and irresponsible to build a bridge in an area with earthquakes!” And then I design a bridge that can survive earthquakes smaller than magnitude X, where magnitude-X earthquakes are expected less than once in 10,000 years, and then on top of that throw on an extra safety margin of 20% because we have the extra steel available. Now how crazy and irresponsible is it?
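As a rough back-of-the-envelope illustration of that “vanishing likelihood” framing (my own sketch, not part of the original exchange; only the 1-in-10,000-years rate comes from the comment, the 75-year service life is an assumed figure):

```python
# Probability of at least one design-exceeding quake during the bridge's
# service life, treating years as independent.

annual_exceedance_probability = 1 / 10_000  # quake above magnitude X in any given year
service_life_years = 75                     # assumed bridge lifetime

p_exceeded = 1 - (1 - annual_exceedance_probability) ** service_life_years
print(f"{p_exceeded:.3%}")  # ~0.747%, before counting the extra 20% safety margin
```

Under these assumptions the lifetime exceedance probability stays well below one percent, which is the kind of quantified claim the earlier “Quantify, please” comment is asking for.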
If you design systems whereby the Scary Idea has no more than a vanishing likelihood of occurring, it no longer becomes an active concern.
Yeah, and the whole problem is how specifically you will do it.

If I (or anyone else) give you examples of what could go wrong, of course you can keep answering with “then I obviously wouldn’t use that design”. But at the end of the day, if you are going to build an AI, you have to settle on some design; just refusing designs given by other people will not do the job.
There are plenty of perfectly good designs out there, e.g. CogPrime + GOLUM. You could be calculating probabilistic risk based on these designs, rather than fear-mongering based on a naïve Bayes net optimizer.