Homomorphic encryption does allow me to run a program without learning anything about the program.
This is definitely true in some cases; even if we were talking about obfuscation, Barak only proves that a universal obfuscator is impossible. As you noted, if a program is homomorphically encrypted, we learn the program’s length. We also learn how long it takes to execute certain operations. We can observe any side effects of the running program, which the encryption doesn’t really change. If we are allowed to give it inputs that haven’t also been homomorphically encrypted (I know you don’t plan to), we might learn much more about the program.
I’m not sure that any of those properties make what you propose a Bad Idea, or even render it unworkable, since our adversary is the AI, not the compiler/translator. On the other hand, I’m still not sure what we’re gaining from the scheme. What are we scared of? I assume we are worried that our AI might write a virus for the verifier, but the AI can attempt that whether or not it has the key to decrypt its own output (though I’ll grant that it might be much more difficult). Once it does that, it can output an AI that it understands, even if the compiler/translator provably can’t, and send the signal that the verifier was supposed to send.
So this is really my objection; a homomorphically encrypted program still understands itself, and its inputs and outputs (it couldn’t work any other way). The guarantee is only that the interpreter running the program cannot decipher any of those without the key, and that doesn’t seem to help in this case.
A homomorphically encrypted program doesn’t really understand the environment it is running in, in that it is computationally incapable of having any effect on the world until you bring it the secret key. The AI can write a virus for the verifier in the sense of a program which causes the verifier to incorrectly accept (if our test for friendliness was not actually correct), but it can’t write some code which, when the verifier attempts to verify it, does horrible damage to the universe. It can’t do anything except try to trick the verifier. This seems to me like the best guarantee you could possibly hope for.
If you run a homomorphically encrypted AI, sufficiently isolated from the private key, you get the very strong guarantee that the AI’s execution has no computationally detectable effect on the rest of the world.
A homomorphically encrypted program doesn’t really understand the environment it is running in, in that it is computationally incapable of having any effect on the world until you bring it the secret key.
I don’t understand something here. I read many of the articles on the homomorphic encryption research and skimmed the article itself, and the impression I had was that homomorphic encryption only meant that the encrypted data could be manipulated by an unencrypted program (a native program, machine code) to produce another batch of encrypted data.
This is interesting, yes. But I don’t see how it protects us from a sandboxed AI. You give the AI its homomorphically encrypted data, and it goes, hmm, very interesting, I’ll put it over here and in the meantime let’s examine the jnz instruction* — oh look, it probably gives off some radiation from the flipping gate which may let me tamper with a WiFi broadcast and let me out onto another x86-based computer from which I can access the broader Internet… etc.

Any encrypted output may be 100% safe, but what about the side effects? In functional programming, we always remember that ‘purely functional’ only means that our functions don’t have any explicit side effects, but said functions can still easily have side effects like using 10GB of RAM and causing the OOM killer to be invoked. And cryptography is well aware of side-channel attacks these days.

* example totally made up
Sorry that this was unclear. The intended use is that the code of the AI itself is the encrypted input, and the plaintext program manipulating it is an interpreter for the language in which the AI is written. For example, the AI could be specified as a LISP program and a homomorphically encrypted copy would be given as input to a LISP interpreter.
So (in Haskell notation) our total setup would be something like

    eval (evilAI ++ problemDescription ++ verifier)

where

    eval :: EncryptedBitstream -> EncryptedBitstream

and the homomorphic argument/input to our evaluator is effectively one gigantic inlined/combined function with no further input?

Correct. The result of that call then gets decrypted.
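For what it’s worth, here is a minimal type-level sketch of that setup. Every name in it (Bitstream, Encrypted, PublicKey, SecretKey, homApply, lispInterp, decrypt) is made up for illustration; it only pins down the shape of the pieces being discussed, not any real FHE library:

    type Bitstream = [Bool]

    newtype Encrypted a = Encrypted Bitstream   -- ciphertext; opaque without the secret key
    data PublicKey = PublicKey Bitstream
    data SecretKey = SecretKey Bitstream

    -- Hypothetical FHE primitive: apply a public plaintext function (really, the
    -- circuit it compiles to) to ciphertext, getting ciphertext back, without the
    -- evaluator ever seeing the plaintext.
    homApply :: PublicKey -> (Bitstream -> Bitstream) -> Encrypted Bitstream -> Encrypted Bitstream
    homApply _ _ _ = undefined

    -- An ordinary, public interpreter for the language the AI is written in
    -- (the Lisp interpreter mentioned above), viewed as a function on bitstreams.
    lispInterp :: Bitstream -> Bitstream
    lispInterp = undefined

    -- The eval above, given a public key. Its single encrypted input is
    -- evilAI ++ problemDescription ++ verifier, and its output stays encrypted.
    eval :: PublicKey -> Encrypted Bitstream -> Encrypted Bitstream
    eval pk = homApply pk lispInterp

    -- Only at the very end does the secret key come anywhere near the output.
    decrypt :: SecretKey -> Encrypted Bitstream -> Bitstream
    decrypt _ _ = undefined

Decryption would only ever be applied to eval’s output, and only on a machine that holds the secret key.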
In Haskell, HomomorphicEncryption would probably be a monad.

Edit: actually, it makes sense syntactically for it to be a monad, but Haskell doesn’t really have the machinery to make it work, so scratch that.

No no, Haskell probably does. Look at the ST monad: you go ‘into’ it and then you can perform all sorts of destructive updates, but purity is preserved by having to ‘escape’ it before its results are available anywhere else.
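For concreteness, the ST pattern being pointed at looks like this (standard Control.Monad.ST and Data.STRef; only the example function is made up):

    import Control.Monad.ST (runST)
    import Data.STRef (newSTRef, modifySTRef', readSTRef)

    -- Inside ST we mutate freely; runST is the 'escape', and the result is pure.
    -- runST :: (forall s. ST s a) -> a  -- the rank-2 's' is what stops STRefs leaking out
    sumDestructively :: [Int] -> Int
    sumDestructively xs = runST $ do
      acc <- newSTRef 0
      mapM_ (\x -> modifySTRef' acc (+ x)) xs
      readSTRef acc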
The problem is that the only thing Haskell can do with functions is use them as black boxes, to the best of my knowledge. To apply a function to homomorphically encrypted data, you can’t use it as a black box—you need to use an explicit description of the function.
Well, my point was that you can do something akin to Perl’s taint - force operations to be done only within a particular type context.
So you could do something similar to the ST monad, but instead of accepting functions which produce output of any type, it operates on, say, a parse tree/ADT representing a Lisp function which is evaluated with the rest of the homomorphic data (roughly as in the sketch below).
But it’s not really important; any such strategy would probably be done in a new language (for efficiency, if nothing else) and the into/escape invariant enforced by manual code inspection or something.
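A rough Haskell sketch of that ‘ADT evaluated inside a restricted context’ idea, with every name (LispExpr, EncryptedVal, Hom, evalLisp, runHom) invented for illustration:

    {-# LANGUAGE RankNTypes #-}

    -- A first-order parse tree for the Lisp-like language the AI is written in.
    data LispExpr
      = Atom String
      | NumLit Integer
      | List [LispExpr]

    -- Opaque ciphertext, tagged with its context the way an STRef is tagged with s.
    data EncryptedVal s = EncryptedVal

    -- 'Hom s' is the only context in which encrypted values may be manipulated.
    newtype Hom s a = Hom { unHom :: a }

    -- Evaluate a public parse tree against the encrypted data; the result stays
    -- inside Hom. (Placeholder body; a real version would drive the FHE scheme.)
    evalLisp :: LispExpr -> EncryptedVal s -> Hom s (EncryptedVal s)
    evalLisp _ v = Hom v

    -- The 'escape', in the style of runST: the rank-2 type keeps anything tagged
    -- with s from leaking out except through this single controlled exit.
    runHom :: (forall s. Hom s a) -> a
    runHom m = unHom m

As with runST, trying to return an EncryptedVal s from runHom fails to typecheck, which is exactly the into/escape invariant described above.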
In order to hack the substrate, the AI would probably need to crack the homomorphic encryption key used to run it. Noticing that jnz gives off radiation won’t do any good if both the timing of when that instruction runs and the results of running it are on the other side of a computationally hard problem.