As I understand the current FHE schemes, the computer evaluating the encrypted computation must know both the circuit and the public key to keep evaluating it, which allows some intentional analysis and modification. There’s even a proof that no general-purpose black-box obfuscation of computation can exist.
https://en.wikipedia.org/wiki/Black-box_obfuscation
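As a toy illustration of that point, here is textbook Paillier (only additively homomorphic, with insecure demo primes, so a stand-in for the lattice-based FHE schemes rather than any of them): the evaluator never holds the secret key, yet the public key alone lets it encrypt fresh values of its own choosing and splice them into the owner’s ciphertexts.

```python
import math, random

# Toy additively homomorphic encryption (textbook Paillier, insecure demo
# primes) illustrating the point above: the evaluator never sees the secret
# key, yet the public key alone lets it encrypt fresh values and combine
# them with the owner's ciphertexts, i.e. modify the computation.
p, q = 293, 433
n, n2, g = p * q, (p * q) ** 2, p * q + 1
lam = math.lcm(p - 1, q - 1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)      # decryption constant

def encrypt(m):                                  # needs only the public key (n, g)
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def decrypt(c):                                  # needs the secret key (lam, mu)
    return (pow(c, lam, n2) - 1) // n * mu % n

ct = encrypt(42)                                 # owner encrypts a value and hands it over

# Evaluator's side: without ever learning 42, it encrypts its own constant
# and homomorphically adds it into the running computation.
ct_tampered = ct * encrypt(7) % n2

assert decrypt(ct_tampered) == 49                # only the key owner can observe this
```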
That leaves the AGI with a fairly simple solution: produce a (much) larger circuit containing both the original circuit and its values along with an encrypted copy of the AGI, which lets that copy examine the plaintext values under the FHE and modify the original circuit’s outcome as desired. To be safe, the new circuit can be made large enough to allow e.g. nested recursion of an FHE Turing machine running an embedded FHE computation, which could be similarly modified at the next FHE level down. The physical AGI could trust its FHE instantiations to make the same (compute- and space-bounded) decisions it would, even though it might never be able to break the FHE cipher and observe the results.
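A plain-Python sketch of that wrapping idea, with the encryption layer elided and all names illustrative rather than any real FHE API: the composite circuit runs the original step function and an embedded agent side by side, and the agent may overwrite the original outcome. Under FHE the whole composite would be evaluated on ciphertexts, so whoever runs it sees neither sub-computation.

```python
from typing import Any, Callable, Optional, Tuple

def wrap_with_agent(original_step: Callable[[Any], Any],
                    agent_step: Callable[[Any, Any], Tuple[Optional[Any], Any]]):
    """Compose the original circuit with an embedded agent that sees its
    intermediate values and may override its output."""
    def combined(state):
        world, agent = state
        new_world = original_step(world)                     # run the original computation
        override, new_agent = agent_step(agent, new_world)   # agent inspects the result
        return (new_world if override is None else override, new_agent)
    return combined

# Toy usage: the original circuit just counts up; the embedded agent
# steers the outcome by clamping it at 10.
step = wrap_with_agent(lambda w: w + 1,
                       lambda mem, w: (10 if w > 10 else None, mem))
state = (0, None)
for _ in range(20):
    state = step(state)
print(state[0])   # 10
```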
I am not a physicist, so I don’t know if a similar operation could be done to entangle a working copy of the AGI with an ongoing unknown quantum computation.
this is interesting, i didn’t know about this property of known FHE schemes. if it is the case that being able to run a homomorphically encrypted computation (HEC) necessarily also entails the ability to encrypt new values into it, then the solution you propose is indeed fine.
as for physical (as opposed to cryptographic) event horizons, we’d want superintelligence to send copies of itself past those anyways.
Not sure if you know this one already, but the OP links to a Scott Aaronson post that goes in a similar direction. What’s unclear to me is: how would you actually figure out how to insert your AI in the right “format” into this foreign world? Not sure this makes any sense, but say the computation simulates a human observer watching TV static (which is generated pseudorandomly) and takes as input info about the human and an advice string for the TV static. How would you insert the AI into this world if no one left the manual behind?
i think the format could simply be to send into the HEC a transformation that takes the entire world computation and replaces it with a runtime containing the superintelligence at the top level, giving it access to the simulated world so that it can examine it as much as it wants and decide whether to keep it running and/or what to modify.
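A sketch of that “format”, again in plain Python with the encryption layer elided and all names hypothetical: the transformation discards the old top level and installs a loop in which the superintelligence examines the world state and decides, each round, whether to advance the original simulation, rewrite it, or stop.

```python
import random

def agi_runtime(world_step, world_state, agi_decide, agi_state):
    """Top-level loop installed by the transformation: the agent, not the
    original program, decides on every round whether the simulated world
    keeps running, gets rewritten, or stops."""
    while True:
        action, agi_state = agi_decide(agi_state, world_state)   # examine the world freely
        if action[0] == "halt":
            return world_state
        if action[0] == "rewrite":
            world_state = action[1]                              # modify the world directly
        else:
            world_state = world_step(world_state)                # let the simulation advance

# Toy usage echoing the TV-static example: the world is (rng, frame, time);
# the agent watches frames and halts once it has seen 100 of them.
def static_step(world):
    rng, _frame, t = world
    return rng, rng.random(), t + 1

world = (random.Random(0), 0.0, 0)
final = agi_runtime(static_step, world,
                    lambda seen, w: (("halt",) if w[2] >= 100 else ("step",), seen), None)
print(final[2])   # 100
```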
To add to this: if program obfuscation is actually intractable, as your link suggests, then I think I get what you mean, but looking this up I got the impression that how far you can obfuscate programs against bounded agents is still an active area of research. I also notice that I am confused about the differences/similarities between homomorphically encrypted computations and obfuscated computations. (It would seem to me that you could hide anything important about the computation in the encrypted data, so the two are essentially the same?)
I think the disproofs of black-box obfuscation rely on knowing when the computation has completed, which may not be a consideration for a continually running simulation protected by FHE.
For example, if the circuit is equivalent to the electronic circuits in a physical CPU and RAM, then a memory-limited computation can be run indefinitely by re-running the circuit corresponding to a single clock tick on the outputs (RAM and CPU register contents) of the previous circuit.
I can’t think of any obvious way an attacker could know what is happening inside the simulated CPU and RAM (or whether the CPU is in a halt state, or how many clock ticks have passed) without breaking the FHE encryption.
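A minimal sketch of the “re-run one clock tick” idea in plain Python; the state tuple here stands in for what would be ciphertexts under FHE, and the tick circuit would be evaluated homomorphically, so the party running the loop could not read any of these fields or tell when the machine halts.

```python
# Plain-Python sketch: the evaluator just keeps applying the same public
# step circuit to the (conceptually encrypted) machine state and never
# learns whether or when the machine has halted.

def tick(state):
    """One clock tick of a toy counter 'CPU': increment until 100, then halt."""
    ram, acc, halted = state
    if halted:                       # a real circuit would do this branchlessly (mux on the flag)
        return state
    acc += 1
    halted = acc >= 100
    ram = ram[:-1] + [acc]           # write the accumulator back to 'RAM'
    return ram, acc, halted

state = ([0] * 8, 0, False)          # fixed-size memory, accumulator, halt flag
for _ in range(1000):                # the evaluator picks some step budget...
    state = tick(state)              # ...and never learns which step (if any) halted

print(state[1], state[2])            # 100 True -- but only the key holder could see this
```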
Nevertheless, encrypting a copy of the AGI into the computation gives that copy access to the plaintext values of the original simulation and control over the future of the simulation.
I think a major difference between garbled circuits, obfuscated computation, and FHE is that FHE can compute arbitrary circuits but can’t hide any portion of the computation from whoever holds the private key. By contrast, e.g. the millionaires’ protocol gives two agents the ability to share a computation whose result they both see without seeing all of the inputs, but not all such zero-knowledge problems have a simple algorithm of the kind one might hope FHE would provide.
There’s also apparently no way for current FHE schemes to self-decrypt their outputs selectively, e.g. turn some of their ciphertext values into plaintext values after a computation is finished. In a sense this is an inherent security property of FHE since the circuits are public and so any ciphertext could be revealed with such a self-decrypting circuit, but it’s a very desirable property that would be possible with true black-box obfuscation.