Pretty good overview of the AI boxing problem with respect to covert channels; possibly the first time I’ve see Eliezer’s experiments cited, or Stuart Armstrong’s Dr. Evil anthropic attack.
While the outlined informational hazards comprise over a dozen categories and are beyond the scope of this paper, it is easy to see how mental state of a person could be stressed to an unstable state. For example a religious guard could be informed of all the (unknown to him) contradictions in the main text of his religion causing him to question his beliefs and the purpose of life.
Given the length of the paper, I kind of expected there to be no mention of homomorphic encryption, as the boxing proposal that seems most viable, but to my surprise I read
The source code and hardware configuration of the system needs to be obfuscated (Yampolskiy & Govindaraju, 2007a) and important modules of the program should be provided only in the homomorphicly encrypted (Gentry, 2009) form, meaning that it could be used for computation or self-improvement (Hall, 2007), but not for self-analysis.
Important modules? Er, why not just the whole thing? If you have homomorphic encryption working and proven correct, the other measures may add a little security, but not a whole lot.
Pretty good overview of the AI boxing problem with respect to covert channels; possibly the first time I’ve see Eliezer’s experiments cited, or Stuart Armstrong’s Dr. Evil anthropic attack.
It says:
Careful analysis of the protocol used by Yudkowsky in conducting his AI-Box experiments reveals that they were unscientific and explicitly designed to show impossibility of confinement.
Roman V Yampolskiy paper
Pretty good overview of the AI boxing problem with respect to covert channels; possibly the first time I’ve see Eliezer’s experiments cited, or Stuart Armstrong’s Dr. Evil anthropic attack.
Given the length of the paper, I kind of expected there to be no mention of homomorphic encryption, as the boxing proposal that seems most viable, but to my surprise I read
Important modules? Er, why not just the whole thing? If you have homomorphic encryption working and proven correct, the other measures may add a little security, but not a whole lot.
It says:
Well, weren’t they? That was the whole point, I had the impression on SL4...