I’m defining “shut down” to mean “render itself incapable of taking action (including performing further calculations) unless acted upon in a specific manner by an outside source.” One means of ensuring that the AI shuts down would be to assign the shut-down state infinite utility once the goal is completed. If you changed the goal and rebooted the AI, it would of course run again, because the prior goal would no longer be stored in its memory.
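The scheme above can be sketched as a toy model. Everything here is hypothetical and illustrative: the state names, the utility numbers, and the use of `float("inf")` as a stand-in for "infinite utility" are my assumptions, not a real agent design.

```python
# Toy model (hypothetical): once the goal is done, the shut-down state
# receives "infinite" utility, so it dominates every other action.
GOAL_DONE = True  # assume the agent has completed its goal

def utility(state, goal_done):
    if state == "shut_down":
        # Stand-in for infinite utility after goal completion;
        # before completion, shutting down is penalized.
        return float("inf") if goal_done else -1.0
    # Staying active is mildly useful only while the goal is unfinished.
    return 0.0 if goal_done else 1.0

actions = ["shut_down", "keep_running"]
best = max(actions, key=lambda a: utility(a, GOAL_DONE))
print(best)  # shut_down
```

The point of the sketch is just that, under this utility assignment, no post-goal action can compete with shutting down, which is the behavior the comment is proposing.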
If the AI makes nanobots which are doing something, I assume that the AI has control over them and can cause them to shut down as well.
How do we describe this shutdown command? “Shut down anything you have control over” sounds like the sort of event we’re trying to avoid.
What about “stop executing/writing code or sending signals?”
As a side note, I think we’re pretty much doomed anyway if the AI cannot conceive of a way to deposit 100 USD into a bank account without using nanotech: that would make the goal hard for the AI, causing it to pose problems similar to those of an AI with an unbounded utility function. The task has to be easy for this to be an interesting problem.
Even if it can deposit $100 with 99.9% probability without doing anything fancy, maybe it can add another 0.099% by using nanotech. Or by starting a nuclear war to distract anything that might get in its way (destroying the bank five minutes later, but so what). (Credit to Carl Shulman for that suggestion.)
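The arithmetic behind this worry can be made explicit. The numbers below are the ones from the comment; the plan names are my own labels. A pure success-probability maximizer prefers any plan with even a sliver more probability, regardless of collateral damage.

```python
# Illustrative arithmetic (numbers from the comment above): a maximizer
# picks whichever plan has the highest probability of success, full stop.
p_plain = 0.999            # deposit $100 by ordinary means
p_fancy = 0.999 + 0.00099  # ordinary means plus nanotech/distraction

plans = {"plain": p_plain, "nanotech_plus_distraction": p_fancy}
chosen = max(plans, key=plans.get)
print(chosen)  # nanotech_plus_distraction
```

The tiny 0.099% edge is decisive for the maximizer, because nothing in its objective penalizes the side effects of the fancier plan.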
By my estimation, all it needs to do is find out how to hack a bank. If it can’t hack one bank, it can try to hack any other bank that it has access to, considering that almost all banks have more than 100 USD in them. It could even find and spread a keylogger to get someone’s credit card info.
Such techniques (which are repeatable within a very short timespan, faster than humans can react) seem far more reliable than using nanotech or starting a nuclear war. I don’t think that distracting humans would really improve its chances of success, because it’s incredibly doubtful that humans could react quickly enough to so many different cyber-attacks.
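The reliability claim comes down to how quickly independent retries compound. The per-attempt probability and number of attempts below are made-up numbers, and treating attempts as independent is an assumption; the sketch just shows the shape of the effect.

```python
# Rough sketch: even a modest per-attempt success chance compounds fast
# over independent retries (independence is an assumption here).
p_single = 0.10   # hypothetical per-bank success probability
n_attempts = 50   # hypothetical number of banks tried

# Probability that at least one attempt succeeds.
p_overall = 1 - (1 - p_single) ** n_attempts
print(round(p_overall, 4))  # ~0.9948
```

So a repeatable 10%-per-try technique already beats the 99.9%-ish single-shot plans discussed above after a few dozen attempts, which is why repeatability matters more than any single attempt's odds.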
Possible, true, but the chances of this happening seem extremely low.