gwern comments on Journal of Consciousness Studies issue on the Singularity

gwern 2 Mar 2012 19:29 UTC
13 points
Roman V Yampolskiy paper

Pretty good overview of the AI boxing problem with respect to covert channels; possibly the first time I’ve seen Eliezer’s experiments cited, or Stuart Armstrong’s Dr. Evil anthropic attack.

While the outlined informational hazards comprise over a dozen categories and are beyond the scope of this paper, it is easy to see how mental state of a person could be stressed to an unstable state. For example a religious guard could be informed of all the (unknown to him) contradictions in the main text of his religion causing him to question his beliefs and the purpose of life.

Given the length of the paper, I kind of expected there to be no mention of homomorphic encryption, the boxing proposal that seems most viable to me, but to my surprise I read:

The source code and hardware configuration of the system needs to be obfuscated (Yampolskiy & Govindaraju, 2007a) and important modules of the program should be provided only in the homomorphicly encrypted (Gentry, 2009) form, meaning that it could be used for computation or self-improvement (Hall, 2007), but not for self-analysis.

Important modules? Er, why not just the whole thing? If you have homomorphic encryption working and proven correct, the other measures may add a little security, but not a whole lot.
- timtyler 3 Mar 2012 2:32 UTC
  5 points
  
  Pretty good overview of the AI boxing problem with respect to covert channels; possibly the first time I’ve seen Eliezer’s experiments cited, or Stuart Armstrong’s Dr. Evil anthropic attack.
  
  It says:
  
  Careful analysis of the protocol used by Yudkowsky in conducting his AI-Box experiments reveals that they were unscientific and explicitly designed to show impossibility of confinement.
  - gwern 3 Mar 2012 2:38 UTC
    8 points
    Well, weren’t they? That was the whole point, I had the impression on SL4...