Another idea for friendliness/containment: run the AI in a simulated world with no communication channels. Right from the outset, give it a bounded utility function that says it has to solve a certain math/physics problem, deposit the correct solution in a specified place and stop. If a solution can’t be found, stop after a specified number of cycles. Don’t talk to it at all. If you want another problem solved, start another AI from a clean slate. Would that work? Are AGI researchers allowed to relax a bit if they follow these precautions?
ETA: absent other suggestions, I’m going to call such devices “AI bombs”.
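To pin down what I mean, here is a minimal toy sketch of the protocol (all the names, run_ai_bomb, step_solver and so on, are invented for illustration, and the "AI" is just a dumb stand-in): one problem goes in, the solver runs for at most a fixed number of cycles with nothing else flowing in or out, and the only thing that ever leaves the box is an answer that an independent, dumb verifier has accepted.
```python
# Hypothetical sketch of the "AI bomb" protocol described above.
# The "solver" here is a brute-force stand-in; the point is the wrapper:
# one problem in, a hard cycle budget, no feedback, and at most one
# machine-checked answer out. A fresh instance is used for each problem.
from typing import Callable, Optional

MAX_CYCLES = 10**6  # hard bound on how long a single instance may run

def run_ai_bomb(problem: int,
                step_solver: Callable[[int, int], Optional[int]],
                verify: Callable[[int, int], bool],
                max_cycles: int = MAX_CYCLES) -> Optional[int]:
    """Run one solver instance on one problem, then throw it away."""
    for cycle in range(max_cycles):
        candidate = step_solver(problem, cycle)   # the solver sees nothing but the problem
        if candidate is not None and verify(problem, candidate):
            return candidate                      # only verified answers leave the box
    return None                                   # budget exhausted: stop, report nothing

# Toy instantiation: "solve" integer factoring by brute force.
def step_factor(n: int, cycle: int) -> Optional[int]:
    d = cycle + 2
    return d if d < n and n % d == 0 else None

def verify_factor(n: int, d: int) -> bool:
    return 1 < d < n and n % d == 0               # dumb, independent, machine-checkable

if __name__ == "__main__":
    print(run_ai_bomb(91, step_factor, verify_factor))   # -> 7
```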
These ideas have already been investigated and documented:
Box: http://fragments.consc.net/djc/2010/04/the-singularity-a-philosophical-analysis.html
Stopping: http://alife.co.uk/essays/stopping_superintelligence/
Are AGI researchers allowed to relax a bit if they follow these precautions?
If these precautions become necessary, the end of the world will follow shortly (which is the only possible conclusion of “AGI research”, so I guess the researchers should rejoice at a job well done, and maybe “relax a bit”, as the world burns).
I don’t understand your argument. Are you saying this containment scheme won’t work because people won’t use it? If so, doesn’t the same objection apply to any FAI effort?
If my Vladimir-modelling heuristic is correct, he’s saying that you’re postulating a world where humanity has developed AGI but not FAI. Having your non-self-improving AGI solve one math problem at a time for you is not going to save the world quickly enough to stop all the other research groups at a similar level of development from turning you and your boxed AGI into paperclips.
An AI in a simulated world isn’t prohibited from improving itself.
More to the point, I didn’t imagine I would save the world by writing one comment on LW :-) My idea of progress is solving small problems conclusively. Eliezer has spent a lot of effort convincing everybody here that AI containment is not just useless—it’s impossible. (Hence the AI-box experiments, the arguments against oracle AIs, etc.) If we update to thinking it’s possible after all, I think that would be enough progress for the day.
I don’t think that’s really an airtight case—there’s a lot that a sufficiently powerful intelligence could learn about its questioners and their environment from the question itself; and since we can’t even prove there’s no such thing as a Langford Basilisk, we can’t establish an upper bound on the complexity of a safe answer. Essentially, researchers would be constrained only by their own best judgement about the complexity of the questions and of the responses.
Of course, all that’s rather unlikely, especially as it (hopefully) wouldn’t be able to upgrade its hardware—but you’re right, software-only self-improvement would still be possible.
Yes, I agree. It would be safest to use such “AI bombs” for solving hard problems with short, machine-checkable solutions, like proving math theorems, designing algorithms or breaking crypto. There’s not much point in the AI inserting backdoors into the answer if all it cares about is the verifier’s response after a trillion cycles, but a really paranoid programmer could also include a term in the AI’s utility function that favors shorter answers over longer ones.
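Concretely, something like this toy sketch is what I have in mind (the subset-sum instance, the length_weight constant and all the names are invented for illustration): the AI’s utility depends only on whether a dumb verifier accepts the deposited certificate, minus a small penalty per symbol of answer, so there is nothing to gain from padding the answer with anything clever.
```python
# Hedged sketch of the "verifier's verdict plus prefer shorter answers" utility
# from the comment above. The certificate format and the weights are made up;
# the point is that utility depends only on the verifier's verdict and the
# answer's length, so elaborate or booby-trapped answers score strictly worse.
from collections import Counter

def verify_subset_sum(instance, certificate):
    """Dumb, independent check: does the claimed subset really sum to the target?"""
    numbers, target = instance
    available, claimed = Counter(numbers), Counter(certificate)
    return all(claimed[x] <= available[x] for x in claimed) and sum(certificate) == target

def bounded_utility(instance, certificate, verify, length_weight=0.001):
    """Bounded utility: 1 for a verified answer, minus a small penalty per element; 0 otherwise."""
    if certificate is None or not verify(instance, certificate):
        return 0.0
    return max(0.0, 1.0 - length_weight * len(certificate))

instance = ([3, 34, 4, 12, 5, 2], 9)
print(bounded_utility(instance, [4, 5], verify_subset_sum))      # 0.998: short, verified answer
print(bounded_utility(instance, [3, 4, 2], verify_subset_sum))   # 0.997: longer but still valid
print(bounded_utility(instance, [34], verify_subset_sum))        # 0.0: fails verification
```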
What khafra said—also, this sounds like propelling toy cars using thermonuclear explosions. How is this analogous to FAI? You want to let the FAI genie out of the bottle (although it will likely need a good sandbox as a testing ground).
Yep, I caught that analogy as I was writing the original comment. Might be more like producing electricity from small, slow thermonuclear explosions, though :-)
Not small explosions. Spill one drop of this toxic stuff and it will eat away the universe, nowhere to hide! It’s not called “intelligence explosion” for nothing.
That’s right—I didn’t offer any arguments that a containment failure would not be catastrophic. But to be fair, FAI has exactly the same requirement for an error-free hardware and software platform; otherwise it destroys the universe just as efficiently.
Sure, prototypes of FAI will be similarly explosive.