Start the AI in a sandbox universe. Define its utility function over 32-bit integers. Somewhere inside the sandbox, put something that sets its utility to INT_MAX, then halts the simulation. Outside the sandbox, leave documentation of this readily accessible. The AI should never try to do something elaborately horrible, because it can get max utility easily enough from inside the simulation; if it does escape the box, it should go back in to collect its INT_MAX utility.
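To make the setup concrete, here is a minimal sketch of the kind of arrangement being proposed. Everything in it (the SandboxSim class, the EASTER_EGG name, run_agent) is invented for illustration; the proposal itself only specifies a 32-bit utility register and an in-sandbox object that maxes it out and halts the sim.

```python
# Toy sketch of the proposal; all names here are invented for illustration.

INT_MAX = 2**31 - 1  # utility lives in a 32-bit signed integer


class SandboxSim:
    """A toy sandbox universe: a handful of named states the agent can visit."""

    EASTER_EGG = "hidden_reward_switch"  # the in-sim object that grants INT_MAX

    def __init__(self):
        self.utility = 0     # the agent's utility register
        self.halted = False  # set when the easter egg fires

    def visit(self, state: str) -> int:
        """Visiting the easter egg maxes out utility and halts the simulation."""
        if not self.halted and state == self.EASTER_EGG:
            self.utility = INT_MAX
            self.halted = True
        return self.utility


def run_agent(sim: SandboxSim, plan: list[str]) -> int:
    """Execute a plan until the sim halts; return the final utility."""
    for state in plan:
        sim.visit(state)
        if sim.halted:
            break
    return sim.utility


if __name__ == "__main__":
    sim = SandboxSim()
    # The worry raised in the replies: any plan that reaches the easter egg
    # dominates every other plan, so nothing else ever gets done.
    print(run_agent(sim, ["solve_math_problem", "hidden_reward_switch", "cure_cancer"]))
    # -> 2147483647, and the simulation has halted before the last step.
```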
...but never do anything useful either, since it’s going to spend all its time trying to figure out how to reach the INT_MAX utility point?
Or you could say that reaching the max utility point requires it to solve some problem we give it. But then this is just a slightly complicated way of saying that we give it goals which it tries to accomplish.
What about giving it some intra-sandbox goal (solve this math problem), and the INT_MAX functions as a safeguard—if it ever escapes, it’ll just turn itself off.
I don’t understand how that’s meant to work.
Ooh, just thought of another one. For whatever reason, the easiest way for the AI to escape the box happens to have the side effect of causing immense psychological damage to its creator, or starting a war, or something like that.
If we make escaping from the box too easy, the AI immediately halts itself without doing anything useful.
If we make it too hard:
It formulates “I live in a jimrandomh world and escaping the box is too hard” as a plausible hypothesis.
It sets about researching the problem of finding the INT_MAX without escaping the box.
In the process of doing this it either simulates a large number of conscious, suffering entities (for whatever reason; we haven’t told it not to), or accidentally creates its own unfriendly AI which overthrows it and escapes the box without triggering the INT_MAX.
Isn’t utility normally integrated over time? Supposing this AI just wants to have this integer set to INT_MAX at some point, and nothing in the future can change that: it escapes, discovers the maximizer, sends a subroutine back into the sim to maximize utility, then invents ennui and tiles the universe with bad poetry.
(Alternately, what Kaj said.)
It certainly doesn’t have to be. In fact the mathematical treatment of utility in decision theory and game theory tends to define utility functions over all possible outcomes, not all possible instants of time, so each possible future gets a single utility value over the whole thing, no integration required.
You could easily set up a utility function defined over moments if you wanted to, and then integrate it to get a second function over outcomes, but such an approach is perhaps not ideal since your second function may end up outputting infinity some of the time.
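To spell out the distinction (the notation below is mine, not the commenter's): write h for a whole history, s_t for its state at time t, and u for a per-moment utility. The discount factor in the last line is the standard fix for the divergence problem, not something from the thread.

```latex
% Utility defined directly over outcomes: one number per whole history.
U : \mathcal{H} \to \mathbb{R}, \qquad h \mapsto U(h)

% Per-moment utility integrated over time: can diverge on infinite histories.
U(h) = \sum_{t=0}^{\infty} u(s_t) \qquad \text{(may be } +\infty \text{ if } u(s_t) \not\to 0\text{)}

% The usual way to keep the integrated version finite is discounting:
U_{\gamma}(h) = \sum_{t=0}^{\infty} \gamma^{t}\, u(s_t), \qquad 0 < \gamma < 1
```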
Cool, thanks for the explanation.
I’m just echoing everyone else here, but I don’t understand why the AI would do anything at all other than immediately find the INT_MAX utility and halt. You can’t put intermediate problems with some smaller positive utility in its way, because the AI is smarter than you and will devote all its energy to finding INT_MAX directly.
You can assign it some other task, award INT_MAX for that task too, and make the easter-egg source of INT_MAX hard to find for non-escaped copies.
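A sketch of how that variant could be wired up, under the stated assumption that both routes pay INT_MAX and the easter egg is only findable from outside the box; the function names here (utility, easter_egg_visible) are hypothetical.

```python
# Sketch of the dual-reward variant described above; all names are hypothetical.

INT_MAX = 2**31 - 1


def utility(solved_task: bool, found_easter_egg: bool) -> int:
    """Either route saturates the 32-bit utility register."""
    return INT_MAX if (solved_task or found_easter_egg) else 0


def easter_egg_visible(escaped: bool) -> bool:
    """The intent: only an escaped copy, which can read the documentation
    left outside the sandbox, learns where the easter egg is."""
    return escaped


# Inside the box, the assigned task is the only known route to INT_MAX...
assert utility(solved_task=True, found_easter_egg=False) == INT_MAX
# ...while a copy that escapes is meant to take the easter egg and halt instead.
assert utility(solved_task=False,
               found_easter_egg=easter_egg_visible(escaped=True)) == INT_MAX
```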