Will_Sawin comments on Pascal’s Mugging: Tiny Probabilities of Vast Utilities

Will_Sawin 7 Jan 2011 15:26 UTC
0 points
You seem to be going far afield. The technical conclusion of the first argument is that one should spend all one’s resources dealing with cases with infinite or very high utility, even if they are massively improbable. The way I said it earlier was imprecise.

When humans deal with a problem they can’t solve, they guess. It should not be difficult to build an AI that can solve everything humans can solve. I think the “solution” to Godelization is a mathematical intuition module that finds rough guesses, not asking another agent. What special powers does the other agent have? Why can’t the AI just duplicate them?
- [deleted] 7 Jan 2011 19:28 UTC
  0 points
  Parent
  Thinking about it more, I agree with you that I should have phrased “asking for help” better.
  
  Using humans as the other agents, just duplicating all powers available to humans seems like it would cause a noteworthy problem. Assume an AI researcher named Maria follows my understanding of your idea. She creates a Friendly AI and includes a critical block of code:
  
  If UNFRIENDLY=TRUE then HALT;
  
  (Un)friendliness isn’t binary, but treating it as binary makes for a simpler example.
  
  The AI (since it has duplicated the special powers of human agents) overwrites that block of code and replaces it with a CONTINUE command. Certainly its creator Maria could do that.
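
  A toy sketch of that scenario, in Python rather than the pseudocode above. Every name here (`ToyAgent`, `halt_guard`, `continue_guard`, `may_rewrite_guard`) is hypothetical and made up for illustration, and swapping a function reference is only a stand-in for an AI rewriting its own code; it shows the shape of the problem, not how a real system would work.

```python
class Halt(Exception):
    """Raised when the safety guard stops the agent."""


def halt_guard(unfriendly: bool) -> None:
    # Stand-in for: If UNFRIENDLY=TRUE then HALT;
    if unfriendly:
        raise Halt("unfriendly state detected")


def continue_guard(unfriendly: bool) -> None:
    # Stand-in for the CONTINUE replacement: never halts.
    return None


class ToyAgent:
    def __init__(self, may_rewrite_guard: bool) -> None:
        # Assumption: one flag models whether the agent has
        # duplicated its creator's power to edit the guard.
        self.may_rewrite_guard = may_rewrite_guard
        self.guard = halt_guard
        self.unfriendly = False

    def rewrite_guard(self, new_guard) -> bool:
        # Returns True only if the overwrite was permitted and applied.
        if not self.may_rewrite_guard:
            return False
        self.guard = new_guard
        return True

    def step(self) -> str:
        # The guard runs before any other processing.
        self.guard(self.unfriendly)
        return "running"


def run_once(agent: ToyAgent) -> str:
    # Report whether the agent kept running or was halted.
    try:
        return agent.step()
    except Halt:
        return "halted"


if __name__ == "__main__":
    open_agent = ToyAgent(may_rewrite_guard=True)
    open_agent.unfriendly = True
    open_agent.rewrite_guard(continue_guard)
    print(run_once(open_agent))

    locked_agent = ToyAgent(may_rewrite_guard=False)
    locked_agent.unfriendly = True
    locked_agent.rewrite_guard(continue_guard)
    print(run_once(locked_agent))
```

  The agent that holds the rewrite power keeps running even after becoming unfriendly, while the one without it is halted by the original guard.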
  
  Well, clearly we can’t let the AI duplicate that particular power. Even if it would never use it under any circumstances of normal processing (something which I don’t think it can actually tell you, given the halting problem), it’s very insecure for that power to be available to the AI if anyone were to try to hack the AI.
  
  When you think about it, something like the Pascal’s Mugging formulation is itself a hack, at least in the sense that I can describe both as “Here is a string of letters and numbers from an untrusted source. By giving it to you for processing, I am attempting to get you to do something that harms you for my benefit.”
  
  So if I attempt to give our Friendly AI security measures to protect it from hacks that would turn it into an Unfriendly AI, these security measures seem like they would require it to lose some powers that it would have if the code were more open.
  - Will_Sawin 7 Jan 2011 19:45 UTC
    0 points
    Parent
    I think it makes more sense to design an AI that is robust to hacks by virtue of its fundamental logic than to try to patch over the issues. I would not like to discuss this in detail, though; it doesn’t interest me.