I think the disproofs of black-box functions rely on knowing when the computation has completed, which may not be a consideration for continually running a simulation protected by FHE.
For example, if the circuit is equivalent to the electronic circuits in a physical CPU and RAM, then a memory-limited computation can be run indefinitely by re-running the circuit for a single clock tick on the outputs (RAM and CPU register contents) of the previous run.
I can’t think of any obvious way an attacker could know what is happening inside the simulated CPU and RAM (or whether the CPU is in a halt state, or how many clock ticks have passed) without breaking the FHE encryption.
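Here is a minimal sketch of the clock-tick idea in plain Python. Nothing in it is encrypted and it doesn't use any real FHE library; the instruction set, the 16-cell RAM layout, and every name in it (`tick`, `mux`, `read`, `write`, `run`) are made up for illustration. The assumption is that under FHE every value would be a ciphertext and `+`, `-`, `*` and the equality test would be homomorphic sub-circuits. The point is only that the step function never branches on data, so whoever is evaluating it runs exactly the same operations on every tick, halted or not.

```python
# Toy, UNENCRYPTED sketch of a data-oblivious clock tick. Hypothetical design:
# under FHE each value would be a ciphertext and each operator a sub-circuit.

HALT, LOAD, ADD, STORE, JNZ = range(5)  # made-up opcodes


def eq(a, b):
    """Indicator (1 or 0) for a == b; stands in for an equality sub-circuit."""
    return int(a == b)


def mux(bit, x, y):
    """Select x when bit is 1, else y, using arithmetic only (no branching)."""
    return bit * x + (1 - bit) * y


def read(ram, addr):
    """Oblivious read: touches every cell, so the access pattern is fixed."""
    return sum(eq(i, addr) * cell for i, cell in enumerate(ram))


def write(ram, addr, value, enable):
    """Oblivious write: rewrites every cell; only ram[addr] changes, and only if enable is 1."""
    return [mux(enable * eq(i, addr), value, cell) for i, cell in enumerate(ram)]


def tick(state):
    """One clock tick. Instructions are two cells: opcode, then operand address."""
    pc, acc, halted, ram = state
    op, arg = read(ram, pc), read(ram, pc + 1)
    mem = read(ram, arg)
    is_halt, is_load, is_add, is_store, is_jnz = (eq(op, k) for k in range(5))
    running = 1 - halted

    # Compute the result of every instruction, then select arithmetically.
    new_acc = is_load * mem + is_add * (acc + mem) + (1 - is_load - is_add) * acc
    jump = is_jnz * (1 - eq(acc, 0))
    new_pc = mux(jump, arg, pc + 2)
    new_ram = write(ram, arg, acc, is_store * running)
    new_halted = mux(running, is_halt, 1)

    # Once halted, the state is a fixed point, but the same work is still done.
    return (mux(running, new_pc, pc), mux(running, new_acc, acc), new_halted, new_ram)


def run(state, ticks):
    """The evaluator just applies tick a fixed number of times, never inspecting the state."""
    for _ in range(ticks):
        state = tick(state)
    return state


# Program: acc = ram[10]; acc += ram[11]; ram[12] = acc; halt.
PROGRAM = [LOAD, 10, ADD, 11, STORE, 12, HALT, 0, 0, 0, 3, 4, 0, 0, 0, 0]
final = run((0, 0, 0, list(PROGRAM)), 20)
```

The program halts after four ticks, but `run` keeps going for all twenty and each tick performs the same sequence of operations. In this toy the halt flag is an ordinary integer you can just look at; the claim is that if it were a ciphertext, the evaluator would have nothing to look at.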
Nevertheless, encrypting the AGI gives that copy access to the plaintext values of the original simulation and control over the future of the simulation.
I think a major difference between garbled circuits, obfuscated computation, and FHE is that FHE can compute arbitrary circuits but can’t hide any portion of the computation from whoever holds the private key. By contrast, the millionaires’ protocol, for example, lets two agents share a computation whose result they both see without either of them seeing all the inputs. However, not all such zero-knowledge problems have a simple algorithm of the kind one might hope FHE would provide.
There’s also apparently no way for current FHE schemes to self-decrypt their outputs selectively, e.g. turn some of their ciphertext values into plaintext values after a computation is finished. In a sense this is an inherent security property of FHE since the circuits are public and so any ciphertext could be revealed with such a self-decrypting circuit, but it’s a very desirable property that would be possible with true black-box obfuscation.