In the last blog post, I introduced my plan to make the safest cryptographic box in the world, and to make it widely available. This would, in theory, make it possible to run infinitely dangerous programs (including superintelligences) safely. This cryptographic box was supposed to use a scheme that is fully homomorphic and perfectly secret at the same time. However, in the last four months, something has changed:
I managed to find a partial fix for the scheme and to implement it. I will prove that this fix enables us to safely evaluate every Clifford gate. But this fix still doesn’t enable us to safely evaluate every quantum gate.
However, not all hope is lost. Although this specific scheme, made by Min Liang, has a flaw, there is another QFHE scheme with perfect secrecy made by the same researcher. In the future, I will try to implement this second QFHE scheme instead, hoping that it does not have any flaw.
This post explains in more details these points. I will also raise concerns about whether such QFHE scheme could help malevolent actors hide the training of superintelligences. If you know about any other FHE or QFHE scheme with perfect secrecy than the ones I have cited, please let me know.
Prerequisites to understand this post
In quantum computing, quantum gates are represented by matrices, and quantum states are either represented as vectors or as density matrices. If we want to apply a gate G on a state s in the vector paradigm, we compute G⋅s, whereas if we want to do so in the density matrix paradigm, we compute G⋅s⋅G−1. I will use the density matrix paradigm, as it gives more information, and because it is the paradigm used in the article that introduced this QFHE scheme. To understand this post, we need to know about the following quantum gates and quantum states:
X=(0110), Y=(0−ii0), and Z=(100−1) are called the Pauli gates. The most familiar of these is the X gate, which is a generalization of the classical NOT gate. Indeed, it transforms zeros into ones and ones into zeros. The other two gates do not have any analog gate in classical programming. They are called X, Y, and Z because they flip the state of the qubit 180 degrees along the x, y, and z axes.
CNOT=⎛⎜
⎜
⎜⎝1000010000010010⎞⎟
⎟
⎟⎠ is a two-qubit gate which applies an X gate on the second qubit if and only if the first one is active. We call it the controlled NOT gate because it applies a NOT gate on the target qubit according to the control qubit. We often say that it is a generalization of the XOR gate, because it sets the target qubit to the XOR between itself and the control qubit. Along with the Hadamard gate H=1√2(111−1) and the gate S=(100i), they form a group of gates called the Clifford group. This group contains the Pauli group.
Rx(θ)=(cos(θ/2)−isin(θ/2)−isin(θ/2)cos(θ/2)), Ry(θ)=(cos(θ/2)−sin(θ/2)sin(θ/2)cos(θ/2)), and Rz(θ)=(e−iθ/200eiθ/2) are called the rotation operator gates. They are generalizations of the Pauli gates: They perform a rotation of θ radians along the x, y, and z axes. They together can form any one-qubit gate.
U(α,β,γ,δ)=eiαRz(β)Ry(γ)Rz(δ) is the universal one-qubit gate, which also can form any one-qubit gate. To do so, it takes four parameters. It can represent any one-qubit gate. Together with CNOT, they form a universal gate set, which means that any quantum operation (and therefore any classical operation) can be performed using only these gates.
Ik is the identity matrix of length k. It leaves everything unchanged.
ek is the vector that contains zeros everywhere except for a 1 at position k, whereas ek,k is the matrix that contains zeros everywhere except for a 1 at position (k,k).
⊗ is the Kronecker product, which converts a k-qubit gate and a l-qubit gate into a (k+l)-qubit gate.
A matrix to the power of zero is equal to the identity matrix.
The scheme
Encryption
If we want to encrypt a message of n bits, we firstly have to transform it into a standard unit vector. For instance, to encrypt the sequence of bits 011, we first convert it into decimal (in which case we get 3), and we then convert it into the vector of 2n elements with zeros everywhere except for a 1 at position 3+1, which is the vector e3+1:
After that, we convert it into a density matrix that we note ρ (we call it the plaintext). As the message is not a superposition, the density matrix is simply a square matrix whose diagonal is the vector we just computed:
Then, we select two random sequences a and b, each of length n. These sequences are called the keys. With these sequences, we can now compute the ciphertext, which is:
σ=(Xa1Zb1⊗⋯⊗XanZbn)⋅ρ⋅(Xa1Zb1⊗⋯⊗XanZbn)−1
Evaluation of CNOT
To apply a CNOT gate on the i-th bit according to the (i−1)-th bit, we firstly have to compute a matrix CNOT′ like so:
CNOT′=CNOT⋅((−1)ai−1biZbi⊗Xai−1)
Then, we update σ according to this formula:
σ←(I2i−2⊗CNOT′⊗I2n−i)⋅σ⋅(I2i−2⊗CNOT′⊗I2n−i)−1
Evaluation of the universal one-qubit gate
To apply U(α,β,γ,δ) on the i-th bit, we compute U′(α,β,γ,δ) like so:
To get back the plaintext from the ciphertext, we use the following formula:
ρ=(Xa1Zb1⊗⋯⊗XanZbn)⋅σ⋅(Xa1Zb1⊗⋯⊗XanZbn)−1
At this point, ρ is the density matrix we were looking for. If the result is not a superposition, then we can transform ρ into a standard unit vector of the form ej+1 by taking the diagonal (If the diagonal is not a standard unit vector, or that the matrix has non-zero elements outside of the diagonal, then the result is not a classical state). Finally, we convert j back to binary to get the initial sequence of bits.
How the scheme looks like in Python
I coded the QFHE scheme in Python. It may help you understand how the scheme is supposed to work. If you are interested, you can find my programs here. One program shows how to implement the Toffoli gate using the scheme, and another one shows how to implement Rule 60, which outputs a Sierpiński triangle (but a very small one because we are trying to simulate a quantum computer inside of a classical computer, which takes a lot of time to compute).
The problem
The problem in the evaluation of CNOT
Suppose we have a ciphertext σ corresponding to a plaintext ρ that contains only classical states. Let’s try to homomorphically apply a CNOT, where the control qubit is ρi and the target qubit is ρj. From this operation, we obtain a ciphertext σ′ corresponding to a plaintext ρ′. σ=σ′ is equivalent to Decrypta,b(σ)=Decrypta,b(σ′), which is equivalent to ρ=ρ′, which is equivalent to ρj=ρ′j, which is equivalent to saying that the target qubit of the CNOT didn’t change, which is equivalent to saying that the control qubit of the CNOT was equal to 0, which is equivalent to ρi=0. Therefore, σ=σ′⟺ρi=0. In other words, by checking whether the ciphertext changed after homomorphically applying a CNOT, we can deduce the value of the control qubit in the plaintext.
The same problem in the evaluation of the universal one-qubit gate
Initially, I thought that the problem arose only for the evaluation of CNOT. For the problem to arise for one-qubit gates, we would need a one-qubit gate G and two states S1 and S2 such that applying G on S1 changes S1, but that applying G on S2 doesn’t change S2. I initially failed to find such a gates, and wrongly thought that it may not exist. However, such gates do exist. For instance, applying the X gate on the |0⟩ state does change it, but applying it on the |+⟩ state doesn’t change it. Therefore, if Alice takes one of these two states randomly, encrypts it, and asks you to homomorphically perform an X gate on it, then you could guess which state she chose by looking at whether the ciphertext changes after evaluating the X gate.
Actually, this problem arises for every one-qubit gate, except for the identity gate. Indeed, since a one-qubit gate corresponds to a rotation of the Bloch sphere, and that every non-null rotation of a sphere has exactly two fixed points, then for every one-qubit gate that is different than the identity gate, there are exactly two inputs that do not change, while the other inputs do change.
A more general version of the problem
Suppose you have a program which, when given the sequence 000, loops back after three timesteps, but which, when given the sequence 111, loops back after two timesteps. Suppose then that Alice took one of these two possible inputs randomly, encrypted it, sent the ciphertext to you, and then asked you to homomorphically run the program on the ciphertext. When running the program, you will see that the ciphertext will loop either every two timesteps or every three timesteps. From this observation, you will be able to guess whether the input Alice took was 000 or 111. That is, you learned something from the plaintext by just applying the evaluation phase.
It turns out that this problem is a generalization of the first one: The first problem is the specific case where the program loops back instantly according to one input, but loops back after two timesteps according to another input.
What we would need to solve this problem
When the plaintext/program loops back, we would want the ciphertext not to loop back too. In other words, we would want “same plaintext → different ciphertexts” to be possible. To do this, we need a key update. But, which key update do we need? To answer this question, I will show in the next section that, in classical computing, the XOR gate can be evaluated safely. By “evaluated safely”, I mean that no one involved in the computation can learn anything about the program, unless they cheat by communicating to each other.
How we would solve the problem in classical computing
The XOR gate can be evaluated safely
Suppose that Phoebe has a plaintext, that she generates a random key of identical length, and that she XORs the two to get a ciphertext (this is the One-time pad that was discussed it in the last post). Then, she sends the key to Kevin and the ciphertext to Charlie. In this way, neither Kevin and Charlie can have any information about the plaintext. After a while, Kevin and Charlie give back the key and the ciphertext, that they modified without communicating to each other or to Phoebe. Phoebe then XORs the two results to get a new plaintext. Now the question is, is there a scheme that Kevin and Charlie can follow in order to compute any function in this way?
I do not know whether this is possible. But it is very easy to see that at least one function is computable in this way. The most obvious example is the identity function, where Kevin and Charlie just do nothing. However, are there other functions? The answer is yes. Actually, every circuit C that consist only of XOR gates can can be computed in this way. To do so, Kevin and Charlie just need to apply C on the key and the ciphertext. For instance, in this picture, Kevin and Charlie help Phoebe to compute a circuit that consists of a single XOR gate:
This shows that it is possible to run any infinitely dangerous program safely as long as they consist only of XOR gates. Indeed, if the program could affect Kevin or Charlie, then they would have gained information about the plaintext, which would contradict the perfect secrecy of the One-time pad.
Are there other classical gates evaluable in this way?
By looking at the picture closely, we can see that asking ourselves whether we can evaluate a function F in this way is equivalent to asking ourselves whether there exist two functions Kevin and Charlie such that:
Kevin(k1,k2)⊕Charlie(k1⊕ρ1,k2⊕ρ2)=F(ρ1,ρ2)
If we assume that Kevin and Charlie are classical circuits, then the only classical circuits that we can compute in this way are the ones that can be built using only XOR gates (Technically, I considered only two-bit and three-bit programs, but it seems very unlikely to me that adding more bits would change the result). To show this, I created a program that brute-forces the problem:
"""We want to find Kevin and Charlie such that Kevin(k1, k2) ⊕ Charlie(k1 ⊕ p1, k2 ⊕ p2) = F(p1, p2)"""
import itertools
def allBinaryFunctions():
return [(lambda x, y, array=arr: array[2*x + y]) for arr in itertools.product([0, 1], repeat=4)]
for F in allBinaryFunctions():
for Kevin, Charlie in itertools.product(allBinaryFunctions(), repeat=2):
areValid = True
for (k1, k2, p1, p2) in itertools.product([0, 1], repeat=4):
areValid &= Kevin(k1, k2) ^ Charlie(k1 ^ p1, k2 ^ p2) == F(p1, p2)
if not areValid:
break
if areValid:
print("Found", F(0, 0), F(0, 1), F(1, 0), F(1, 1))
print("Kevin :", Kevin(0, 0), Kevin(0, 1), Kevin(1, 0), Kevin(1, 1))
print("Charlie :", Charlie(0, 0), Charlie(0, 1), Charlie(1, 0), Charlie(1, 1), "\n")
break
This program outputs every two-bit functions that Kevin and Charlie can compute together. However, it turns out that all of those programs are exactly those that can be built using only the XOR gate.
This is unfortunate for us, because the XOR gate isn’t a universal gate. However, not all hope is lost, as we focused only on classical computing. Is there a way to compute more functions than that using quantum computing?
I will now show that a similar technique can be applied for the QFHE scheme. However, there are still a few differences. For instance, in the QFHE scheme, we have two keys instead of one.
Which quantum gates can be evaluated safely without ancilla qubits?
An algorithm to know whether a one-qubit gate can be evaluated safely without ancilla qubits
Suppose, without loss of generality, that our ciphertext has a single qubit. We would want to safely evaluate a one-qubit gate G without any ancilla qubit. Kevin will be given the two keys, which are the two bits a1 and b1, whereas Charlie will be given the ciphertext, which is a qubit σ. Kevin and Charlie will have to update (a1,b1) and σ respectively, using two functions f,g:B×B→B and a quantum gate G′ like so:
σ←G′⋅σ⋅G′−1a1,b1←f(a1,b1),g(a1,b1)
For the evaluation to be valid, it should be the case that encrypting a plaintext ρ, then applying the update, and then decrypting it gives us the right plaintext. In other words, it should be the case that, for all (a1,b1)∈B2 and all ρ, this equation holds:
Firstly, given G′, f and g, how can we verify whether the equation holds for every (a1,b1) and ρ? We cannot enumerate all the possible ρ, as there is an infinite amount of them. We could also try a lot of different ρ, but this would take a while, and we may still miss some rare exceptions. To solve this problem, I will firstly have to show that for every gate A and B, we have ACA−1=BCB−1 for every state C if and only if there exists λ such that A=λB. (This post initially had a proof that worked only for 2×2 reversible matrices. Thankfully, one of my classmates, Julia PHAM BA NIEN, gave me a proof that works for every reversible matrix, which is now included here).
ACA−1=BCB−1⟺B−1ACA−1A=B−1BCB−1A⟺(B−1A)C=C(B−1A)
Therefore, B−1A has to commute with every other matrices. Let’s prove that only matrices of the form λI do so. Firstly, it is trivial that every matrix of the form λI commutes with every other matrices, as (λI)C=λC=λCI=C(λI). Now, let’s prove that every matrix that commutes with every other matrices are of the form λI. Let P be a matrix that commutes with every other matrices. Note in particular that, for every i, we have:
Therefore, for every i and j, we have αi,j=0 if i ≠j, and αi,j=α1,1 if i=j. Therefore, all non-diagonal elements are equal to zero, and all diagonal elements are pairwise equal. As such, P is of the form λI. In our case, we therefore have B−1A=λI, and therefore A=λB, which concludes the proof.
This property is useful to us, because our equation can be rewritten in the form ACA−1=BCB−1 like so:
Therefore, given G′, f and g, we can verify whether the above equation holds for every (a1,b1) and ρ by checking that, for every (a1,b1), there exists λ such that:
((Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1))=λG
Now that we know how to test in finite time whether a given G′, f and g enable us to safely evaluate G, how can we find such G′, f and g? Since there are infinitely many quantum gates, we cannot enumerate all the possible gates G′. However, we can do better than this. Indeed, if we look at the equation that we deduced, we have:
Therefore, to verify whether G can be evaluated safely, we can do as follows: For every function f and g, we try to see whether, for all (a1,b1), the result of (Xf(a1,b1)Zg(a1,b1))−1⋅G⋅(Xa1Zb1)−1 stays the same up to a scalar (as we removed λ). If we find such f and g, then it means that we can safely evaluate G. After finding f and g, we can choose any G′ that is equal to (Xf(a1,b1)Zg(a1,b1))−1⋅G⋅(Xa1Zb1)−1 up to a scalar. Here, the choice of (a1,b1) shouldn’t matter when we generate G′ (more precisely, we may get different G′, but they would all work, as they would differ only by a scalar).
What happens if we don’t find such f and g? It would mean that, either the gate cannot be evaluated safely, or that this gate needs more than one qubit. It may be that the second scenario never happens, and that if we don’t find such f and g, then this gate is out of reach forever, whether or not we use ancilla qubits. However, I haven’t investigated this yet.
Combinatorial explosion of generalizing this algorithm for n-qubit gates
The algorithm shown above can theoretically be generalized for n-qubit gates. For every (f1,…,fn) and (g1,…,gn), we look at whether, for all a=(a1,…,an) and b=(b1,…,bn), this stays the same up to a scalar:
If we find such (f1,…,fn) and (g1,…,gn), we can choose any G′ that is equal to this up to a scalar.
However, this program suffers from combinatorial explosion. Indeed, even for 2-qubit gates, we need to look at every quadruple of functions (f1,f2,g1,g2) that take four parameters each. That is, we need to look at (2(24))4=264=18446744073709600000 quadruples of functions.
The Clifford group is the group of gates that can be made from {CNOT,H,S}. It is not universal, but it contains useful gates like the Pauli group that can be made from{X,Y,Z}. Using the algorithm of the last section, we can prove that H and S can be evaluated safely. Finding a way to safely evaluate the CNOT gate is a bit harder, as we cannot use the algorithm because of combinatorial explosition. However, with a bit of math and brute-force, I managed to find a safe evaluation of CNOT. Therefore, every Clifford gate can be evaluated safely. Although the algorithm finds many different ways to evaluate H and S, I will present only the ones that I consider the simplest.
Evaluation of CNOT
Instead of computing CNOT′, we will instead update σ, a and b like so:
Here is the reason why this will give us the right result: Without loss of generality, let’s assume that our initial sequence has only two bits. We would want to encrypt ρ, apply a CNOT on it in this way, and decrypt it. In that case, the plaintext we will obtain is:
Which is the plaintext that we wanted. The first equality holds because CNOT⋅(Xa1Zb1⊗Xa2Zb2) is equal to (Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)⋅CNOT. I found this result by creating a program that brute-forces the problem, which you can see here. The second equality holds because applying twice the same sequence of X and Z on a quantum state yields back the same quantum state.
Evaluation of H
To apply a H gate on the i-th qubit, we update σ, a and b like so:
σ←(I2i−1⊗H⊗I2n−i)⋅σ⋅(I2i−1⊗H⊗I2n−i)−1ai,bi←bi,ai
Indeed, without loss of generality, suppose that our plaintext ρ has a single qubit. In that case, when we encrypt it, apply a H gate on it in this way, and decrypt it, then the plaintext we obtain is:
The equality holds because (Xb1Za1)⋅H⋅(Xa1Zb1)=H for all (a1,b1)∈B. This technique was firstly proposed by Andrew Childs in this article in a slightly different context.
Evaluation of S
To apply a S gate on the i-th qubit, we do:
σ←(I2i−1⊗S⊗I2n−i)⋅σ⋅(I2i−1⊗S⊗I2n−i)−1bi←ai⊕bi
Now, to prove that the evaluation of S gives the right result, we ask ourselves whether encrypting a qubit, then applying S on it in the way shown above, and then decrypting it gives us the right plaintext. In other words, we ask ourselves whether this equation holds:
And indeed, this equation holds because, for all (a1,b1), ((Xa1Za1⊕b1)⋅S⋅(Xa1Zb1)) differs from S by only a scalar.
The T gate cannot be evaluated safely without ancilla qubits
Earlier, I said that the Clifford group made from {H,S,CNOT} is not universal. However, if we augment this group by adding the T gate, then the group becomes universal. Therefore, it may be interesting to see whether the T gate can be evaluated safely, as this would enable us to evaluate safely every program.
Unfortunately, the algorithm doesn’t find any way to safely evaluate the T gate. Because of the way that the algorithm was made, this proves that the T gate cannot be safely evaluated without ancilla qubits.
Can the T gate be safely evaluated with ancilla qubits? As Min Liang has made another QFHE scheme with perfect secrecy which uses ancilla qubits for the T gate, it may be possible, as long as this QFHE scheme doesn’t contain any flaw. I haven’t investigated this yet, and this is my next step in my plan.
Concerns about bad actors hiding the training of superintelligences
Without FHE with perfect secrecy, bad actors could hide the training of superintelligences by making air-gapped computers hidden from the public. They may be caught in this process, or they may not have enough resources. However, if they had an efficient FHE scheme with perfect secrecy, they could directly use the computing power of the Cloud, which is way easier to do, and which would make them able to use way more computing power. This would help them avoid getting caught by authorities, and therefore would help them bypass any safety regulation.
After a few days of thought, I concluded that the problem does not arise from the fact that the QFHE scheme I try to develop is perfectly secret. Indeed, although FHE schemes without perfect secrecy aren’t safe enough in order to control misaligned superintelligences, they are safe enough to avoid getting broken by authorities.
Instead, the problem is that the FHE scheme I try to develop has a low time complexity. Indeed, current FHE schemes have a very high time complexity, which makes them useless in order to hide the training of superintelligences.
However, the high time complexity of current FHE schemes can already be partially bypassed by bad actors, by using only partially homomorphic encryption schemes or FHE schemes that work well for very specific computations. These schemes can have a time complexity low enough in order to run neural networks, where the most famous example is this article.
I started getting worried since the moment I first thought about the possibility that bad actors would try to hide their attempts to train superintelligences. I started thinking about whether I should stop my research. I am still unsure about whether it causes any harm, as the problem also somewhat arises from partially homomorphic encryption schemes. The way that my plan, if achieved, could help bad actors is by making the encryption of programs automatic (there is no need to change the code or the architecture of the superintelligence, as long as it is written for quantum computers), and making their evaluation faster because of a lower time complexity.
Personally, I think that bad actors will mostly be caught before starting to train their superintelligence. Indeed, it seems very likely that developing the source code of the superintelligence needs way more people and time than the superintelligence’s training. However, helping bad actors not to be caught during the training is still a significant help for them, and therefore it is a significant harm for everyone else.
Overall, I think that bad actors already have other ways to hide the training of superintelligences while using the computing power of the Cloud, but that QFHE schemes may have a significant chance to help them in this process. I do not know whether the safety that the QFHE provides against superintelligences is outweighted by the risks of bad actors hiding the training of superintelligences. I hope not, but if most people think that my research is net negative, I will stop it.
Conclusion
In this post, we have seen that the symmetric QFHE scheme had a flaw that enables us to obtain information about the plaintext by just making computations on the ciphertext. To prevent this, we added a key update that is independent of the ciphertext, and we made the ciphertext update independent of the key. This enables us to safely evaluate any program made out of Clifford gates, which means that no party involved can have any information about the plaintext, unless they communicate to each other (which is forbidden). However, this fix doesn’t work for the T gate, which means that if this gate can be evaluated safely, then it needs at least one ancilla qubit. I also raised my concerns about whether QFHE schemes could help malevolent actors train superintelligences without being detected by authorities. In the future, I will try to find whether there are non-Clifford gates that can be evaluated safely.
Disproving and partially fixing a fully homomorphic encryption scheme with perfect secrecy
Summary
In the last blog post, I introduced my plan to make the safest cryptographic box in the world, and to make it widely available. This would, in theory, make it possible to run infinitely dangerous programs (including superintelligences) safely. This cryptographic box was supposed to use a scheme that is fully homomorphic and perfectly secret at the same time. However, in the last four months, something has changed:
I discovered that the symmetric Quantum Fully Homomorphic Encryption scheme with perfect secrecy that I was using has a flaw.
I managed to find a partial fix for the scheme and to implement it. I will prove that this fix enables us to safely evaluate every Clifford gate. But this fix still doesn’t enable us to safely evaluate every quantum gate.
However, not all hope is lost. Although this specific scheme, made by Min Liang, has a flaw, there is another QFHE scheme with perfect secrecy made by the same researcher. In the future, I will try to implement this second QFHE scheme instead, hoping that it does not have any flaw.
This post explains in more details these points. I will also raise concerns about whether such QFHE scheme could help malevolent actors hide the training of superintelligences. If you know about any other FHE or QFHE scheme with perfect secrecy than the ones I have cited, please let me know.
Prerequisites to understand this post
In quantum computing, quantum gates are represented by matrices, and quantum states are either represented as vectors or as density matrices. If we want to apply a gate G on a state s in the vector paradigm, we compute G⋅s, whereas if we want to do so in the density matrix paradigm, we compute G⋅s⋅G−1. I will use the density matrix paradigm, as it gives more information, and because it is the paradigm used in the article that introduced this QFHE scheme. To understand this post, we need to know about the following quantum gates and quantum states:
X=(0110), Y=(0−ii0), and Z=(100−1) are called the Pauli gates. The most familiar of these is the X gate, which is a generalization of the classical NOT gate. Indeed, it transforms zeros into ones and ones into zeros. The other two gates do not have any analog gate in classical programming. They are called X, Y, and Z because they flip the state of the qubit 180 degrees along the x, y, and z axes.
CNOT=⎛⎜ ⎜ ⎜⎝1000010000010010⎞⎟ ⎟ ⎟⎠ is a two-qubit gate which applies an X gate on the second qubit if and only if the first one is active. We call it the controlled NOT gate because it applies a NOT gate on the target qubit according to the control qubit. We often say that it is a generalization of the XOR gate, because it sets the target qubit to the XOR between itself and the control qubit. Along with the Hadamard gate H=1√2(111−1) and the gate S=(100i), they form a group of gates called the Clifford group. This group contains the Pauli group.
Rx(θ)=(cos(θ/2)−isin(θ/2)−isin(θ/2)cos(θ/2)), Ry(θ)=(cos(θ/2)−sin(θ/2)sin(θ/2)cos(θ/2)), and Rz(θ)=(e−iθ/200eiθ/2) are called the rotation operator gates. They are generalizations of the Pauli gates: They perform a rotation of θ radians along the x, y, and z axes. They together can form any one-qubit gate.
U(α,β,γ,δ)=eiαRz(β)Ry(γ)Rz(δ) is the universal one-qubit gate, which also can form any one-qubit gate. To do so, it takes four parameters. It can represent any one-qubit gate. Together with CNOT, they form a universal gate set, which means that any quantum operation (and therefore any classical operation) can be performed using only these gates.
Ik is the identity matrix of length k. It leaves everything unchanged.
ek is the vector that contains zeros everywhere except for a 1 at position k, whereas ek,k is the matrix that contains zeros everywhere except for a 1 at position (k,k).
⊗ is the Kronecker product, which converts a k-qubit gate and a l-qubit gate into a (k+l)-qubit gate.
A matrix to the power of zero is equal to the identity matrix.
The scheme
Encryption
If we want to encrypt a message of n bits, we firstly have to transform it into a standard unit vector. For instance, to encrypt the sequence of bits 011, we first convert it into decimal (in which case we get 3), and we then convert it into the vector of 2n elements with zeros everywhere except for a 1 at position 3+1, which is the vector e3+1:
e3+1=⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝00010000⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠After that, we convert it into a density matrix that we note ρ (we call it the plaintext). As the message is not a superposition, the density matrix is simply a square matrix whose diagonal is the vector we just computed:
ρ=⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝0000000000000000000000000001000000000000000000000000000000000000⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠Then, we select two random sequences a and b, each of length n. These sequences are called the keys. With these sequences, we can now compute the ciphertext, which is:
σ=(Xa1Zb1⊗⋯⊗XanZbn)⋅ρ⋅(Xa1Zb1⊗⋯⊗XanZbn)−1Evaluation of CNOT
To apply a CNOT gate on the i-th bit according to the (i−1)-th bit, we firstly have to compute a matrix CNOT′ like so:
CNOT′=CNOT⋅((−1)ai−1biZbi⊗Xai−1)Then, we update σ according to this formula:
σ←(I2i−2⊗CNOT′⊗I2n−i)⋅σ⋅(I2i−2⊗CNOT′⊗I2n−i)−1Evaluation of the universal one-qubit gate
To apply U(α,β,γ,δ) on the i-th bit, we compute U′(α,β,γ,δ) like so:
U′(α,β,γ,δ)=U(α,(−1)aiβ,(−1)ai+biγ,(−1)aiδ)And we then update σ according to this formula:
σ←(I2i−1⊗U′(α,β,γ,δ)⊗I2n−i)⋅σ⋅(I2i−1⊗U′(α,β,γ,δ)⊗I2n−i)−1Decryption
To get back the plaintext from the ciphertext, we use the following formula:
ρ=(Xa1Zb1⊗⋯⊗XanZbn)⋅σ⋅(Xa1Zb1⊗⋯⊗XanZbn)−1At this point, ρ is the density matrix we were looking for. If the result is not a superposition, then we can transform ρ into a standard unit vector of the form ej+1 by taking the diagonal (If the diagonal is not a standard unit vector, or that the matrix has non-zero elements outside of the diagonal, then the result is not a classical state). Finally, we convert j back to binary to get the initial sequence of bits.
How the scheme looks like in Python
I coded the QFHE scheme in Python. It may help you understand how the scheme is supposed to work. If you are interested, you can find my programs here. One program shows how to implement the Toffoli gate using the scheme, and another one shows how to implement Rule 60, which outputs a Sierpiński triangle (but a very small one because we are trying to simulate a quantum computer inside of a classical computer, which takes a lot of time to compute).
The problem
The problem in the evaluation of CNOT
Suppose we have a ciphertext σ corresponding to a plaintext ρ that contains only classical states. Let’s try to homomorphically apply a CNOT, where the control qubit is ρi and the target qubit is ρj. From this operation, we obtain a ciphertext σ′ corresponding to a plaintext ρ′. σ=σ′ is equivalent to Decrypta,b(σ)=Decrypta,b(σ′), which is equivalent to ρ=ρ′, which is equivalent to ρj=ρ′j, which is equivalent to saying that the target qubit of the CNOT didn’t change, which is equivalent to saying that the control qubit of the CNOT was equal to 0, which is equivalent to ρi=0. Therefore, σ=σ′⟺ρi=0. In other words, by checking whether the ciphertext changed after homomorphically applying a CNOT, we can deduce the value of the control qubit in the plaintext.
The same problem in the evaluation of the universal one-qubit gate
Initially, I thought that the problem arose only for the evaluation of CNOT. For the problem to arise for one-qubit gates, we would need a one-qubit gate G and two states S1 and S2 such that applying G on S1 changes S1, but that applying G on S2 doesn’t change S2. I initially failed to find such a gates, and wrongly thought that it may not exist. However, such gates do exist. For instance, applying the X gate on the |0⟩ state does change it, but applying it on the |+⟩ state doesn’t change it. Therefore, if Alice takes one of these two states randomly, encrypts it, and asks you to homomorphically perform an X gate on it, then you could guess which state she chose by looking at whether the ciphertext changes after evaluating the X gate.
Actually, this problem arises for every one-qubit gate, except for the identity gate. Indeed, since a one-qubit gate corresponds to a rotation of the Bloch sphere, and that every non-null rotation of a sphere has exactly two fixed points, then for every one-qubit gate that is different than the identity gate, there are exactly two inputs that do not change, while the other inputs do change.
A more general version of the problem
Suppose you have a program which, when given the sequence 000, loops back after three timesteps, but which, when given the sequence 111, loops back after two timesteps. Suppose then that Alice took one of these two possible inputs randomly, encrypted it, sent the ciphertext to you, and then asked you to homomorphically run the program on the ciphertext. When running the program, you will see that the ciphertext will loop either every two timesteps or every three timesteps. From this observation, you will be able to guess whether the input Alice took was 000 or 111. That is, you learned something from the plaintext by just applying the evaluation phase.
It turns out that this problem is a generalization of the first one: The first problem is the specific case where the program loops back instantly according to one input, but loops back after two timesteps according to another input.
What we would need to solve this problem
When the plaintext/program loops back, we would want the ciphertext not to loop back too. In other words, we would want “same plaintext → different ciphertexts” to be possible. To do this, we need a key update. But, which key update do we need? To answer this question, I will show in the next section that, in classical computing, the XOR gate can be evaluated safely. By “evaluated safely”, I mean that no one involved in the computation can learn anything about the program, unless they cheat by communicating to each other.
How we would solve the problem in classical computing
The XOR gate can be evaluated safely
Suppose that Phoebe has a plaintext, that she generates a random key of identical length, and that she XORs the two to get a ciphertext (this is the One-time pad that was discussed it in the last post). Then, she sends the key to Kevin and the ciphertext to Charlie. In this way, neither Kevin and Charlie can have any information about the plaintext. After a while, Kevin and Charlie give back the key and the ciphertext, that they modified without communicating to each other or to Phoebe. Phoebe then XORs the two results to get a new plaintext. Now the question is, is there a scheme that Kevin and Charlie can follow in order to compute any function in this way?
I do not know whether this is possible. But it is very easy to see that at least one function is computable in this way. The most obvious example is the identity function, where Kevin and Charlie just do nothing. However, are there other functions? The answer is yes. Actually, every circuit C that consist only of XOR gates can can be computed in this way. To do so, Kevin and Charlie just need to apply C on the key and the ciphertext. For instance, in this picture, Kevin and Charlie help Phoebe to compute a circuit that consists of a single XOR gate:
This shows that it is possible to run any infinitely dangerous program safely as long as they consist only of XOR gates. Indeed, if the program could affect Kevin or Charlie, then they would have gained information about the plaintext, which would contradict the perfect secrecy of the One-time pad.
Are there other classical gates evaluable in this way?
By looking at the picture closely, we can see that asking ourselves whether we can evaluate a function F in this way is equivalent to asking ourselves whether there exist two functions Kevin and Charlie such that:
Kevin(k1,k2)⊕Charlie(k1⊕ρ1,k2⊕ρ2)=F(ρ1,ρ2)If we assume that Kevin and Charlie are classical circuits, then the only classical circuits that we can compute in this way are the ones that can be built using only XOR gates (Technically, I considered only two-bit and three-bit programs, but it seems very unlikely to me that adding more bits would change the result). To show this, I created a program that brute-forces the problem:
This program outputs every two-bit functions that Kevin and Charlie can compute together. However, it turns out that all of those programs are exactly those that can be built using only the XOR gate.
This is unfortunate for us, because the XOR gate isn’t a universal gate. However, not all hope is lost, as we focused only on classical computing. Is there a way to compute more functions than that using quantum computing?
I will now show that a similar technique can be applied for the QFHE scheme. However, there are still a few differences. For instance, in the QFHE scheme, we have two keys instead of one.
Which quantum gates can be evaluated safely without ancilla qubits?
An algorithm to know whether a one-qubit gate can be evaluated safely without ancilla qubits
Suppose, without loss of generality, that our ciphertext has a single qubit. We would want to safely evaluate a one-qubit gate G without any ancilla qubit. Kevin will be given the two keys, which are the two bits a1 and b1, whereas Charlie will be given the ciphertext, which is a qubit σ. Kevin and Charlie will have to update (a1,b1) and σ respectively, using two functions f,g:B×B→B and a quantum gate G′ like so:
σ←G′⋅σ⋅G′−1a1,b1←f(a1,b1),g(a1,b1)For the evaluation to be valid, it should be the case that encrypting a plaintext ρ, then applying the update, and then decrypting it gives us the right plaintext. In other words, it should be the case that, for all (a1,b1)∈B2 and all ρ, this equation holds:
(Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1)⋅ρ⋅(Xa1Zb1)−1⋅G′−1⋅(Xf(a1,b1)Zg(a1,b1))−1=G⋅ρ⋅G−1Firstly, given G′, f and g, how can we verify whether the equation holds for every (a1,b1) and ρ? We cannot enumerate all the possible ρ, as there is an infinite amount of them. We could also try a lot of different ρ, but this would take a while, and we may still miss some rare exceptions. To solve this problem, I will firstly have to show that for every gate A and B, we have ACA−1=BCB−1 for every state C if and only if there exists λ such that A=λB. (This post initially had a proof that worked only for 2×2 reversible matrices. Thankfully, one of my classmates, Julia PHAM BA NIEN, gave me a proof that works for every reversible matrix, which is now included here).
ACA−1=BCB−1⟺B−1ACA−1A=B−1BCB−1A⟺(B−1A)C=C(B−1A)Therefore, B−1A has to commute with every other matrices. Let’s prove that only matrices of the form λI do so. Firstly, it is trivial that every matrix of the form λI commutes with every other matrices, as (λI)C=λC=λCI=C(λI). Now, let’s prove that every matrix that commutes with every other matrices are of the form λI. Let P be a matrix that commutes with every other matrices. Note in particular that, for every i, we have:
Pe1,i=e1,iP⟺⎛⎜ ⎜ ⎜⎝α1,1α1,2…α2,1⋱⋮⎞⎟ ⎟ ⎟⎠e1,i=e1,i⎛⎜ ⎜ ⎜⎝α1,1α1,2…α2,1⋱⋮⎞⎟ ⎟ ⎟⎠⟺⎛⎜ ⎜⎝0…α1,1…0⋮⋮⋮0…αn,1…0⎞⎟ ⎟⎠=⎛⎜ ⎜⎝αi,1…αi,i…αi,n⋮⋮⋮0…0…0⎞⎟ ⎟⎠Therefore, for every i and j, we have αi,j=0 if i ≠j, and αi,j=α1,1 if i=j. Therefore, all non-diagonal elements are equal to zero, and all diagonal elements are pairwise equal. As such, P is of the form λI. In our case, we therefore have B−1A=λI, and therefore A=λB, which concludes the proof.
This property is useful to us, because our equation can be rewritten in the form ACA−1=BCB−1 like so:
((Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1))⋅ρ⋅((Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1))−1=G⋅ρ⋅G−1Therefore, given G′, f and g, we can verify whether the above equation holds for every (a1,b1) and ρ by checking that, for every (a1,b1), there exists λ such that:
((Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1))=λGNow that we know how to test in finite time whether a given G′, f and g enable us to safely evaluate G, how can we find such G′, f and g? Since there are infinitely many quantum gates, we cannot enumerate all the possible gates G′. However, we can do better than this. Indeed, if we look at the equation that we deduced, we have:
((Xf(a1,b1)Zg(a1,b1))⋅G′⋅(Xa1Zb1))=λG⟺G′=(Xf(a1,b1)Zg(a1,b1))−1⋅λG⋅(Xa1Zb1)−1Therefore, to verify whether G can be evaluated safely, we can do as follows: For every function f and g, we try to see whether, for all (a1,b1), the result of (Xf(a1,b1)Zg(a1,b1))−1⋅G⋅(Xa1Zb1)−1 stays the same up to a scalar (as we removed λ). If we find such f and g, then it means that we can safely evaluate G. After finding f and g, we can choose any G′ that is equal to (Xf(a1,b1)Zg(a1,b1))−1⋅G⋅(Xa1Zb1)−1 up to a scalar. Here, the choice of (a1,b1) shouldn’t matter when we generate G′ (more precisely, we may get different G′, but they would all work, as they would differ only by a scalar).
What happens if we don’t find such f and g? It would mean that, either the gate cannot be evaluated safely, or that this gate needs more than one qubit. It may be that the second scenario never happens, and that if we don’t find such f and g, then this gate is out of reach forever, whether or not we use ancilla qubits. However, I haven’t investigated this yet.
Combinatorial explosion of generalizing this algorithm for n-qubit gates
The algorithm shown above can theoretically be generalized for n-qubit gates. For every (f1,…,fn) and (g1,…,gn), we look at whether, for all a=(a1,…,an) and b=(b1,…,bn), this stays the same up to a scalar:
(Xf1(a,b)Zg1(a,b)⊗⋯⊗Xfn(a,b)Zgn(a,b))−1⋅G⋅(Xa1Zb1⊗⋯⊗XanZbn)−1If we find such (f1,…,fn) and (g1,…,gn), we can choose any G′ that is equal to this up to a scalar.
However, this program suffers from combinatorial explosion. Indeed, even for 2-qubit gates, we need to look at every quadruple of functions (f1,f2,g1,g2) that take four parameters each. That is, we need to look at (2(24))4=264=18446744073709600000 quadruples of functions.
So, this algorithm cannot be used in practice for two-qubit gates. However, it works fine for one-qubit gates. Indeed, I have implemented this algorithm in Python for one-qubit gates, and it has proven useful to me when writing the next sections.
Every Clifford gate can be evaluated safely
The Clifford group is the group of gates that can be made from {CNOT,H,S}. It is not universal, but it contains useful gates like the Pauli group that can be made from{X,Y,Z}. Using the algorithm of the last section, we can prove that H and S can be evaluated safely. Finding a way to safely evaluate the CNOT gate is a bit harder, as we cannot use the algorithm because of combinatorial explosition. However, with a bit of math and brute-force, I managed to find a safe evaluation of CNOT. Therefore, every Clifford gate can be evaluated safely. Although the algorithm finds many different ways to evaluate H and S, I will present only the ones that I consider the simplest.
Evaluation of CNOT
Instead of computing CNOT′, we will instead update σ, a and b like so:
σ←(I2i−2⊗CNOT⊗I2n−i)⋅σ⋅(I2i−2⊗CNOT⊗I2n−i)−1ai←ai−1⊕aibi−1←bi−1⊕biHere is the reason why this will give us the right result: Without loss of generality, let’s assume that our initial sequence has only two bits. We would want to encrypt ρ, apply a CNOT on it in this way, and decrypt it. In that case, the plaintext we will obtain is:
(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)⋅CNOT⋅(Xa1Zb1⊗Xa2Zb2)⋅ρ⋅(Xa1Zb1⊗Xa2Zb2)−1⋅CNOT−1⋅(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)−1=(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)⋅(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)⋅CNOT⋅ρ⋅CNOT−1⋅(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)−1⋅(Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)−1=CNOT⋅ρ⋅CNOT−1Which is the plaintext that we wanted. The first equality holds because CNOT⋅(Xa1Zb1⊗Xa2Zb2) is equal to (Xa1Zb1⊕b2⊗Xa1⊕a2Zb2)⋅CNOT. I found this result by creating a program that brute-forces the problem, which you can see here. The second equality holds because applying twice the same sequence of X and Z on a quantum state yields back the same quantum state.
Evaluation of H
To apply a H gate on the i-th qubit, we update σ, a and b like so:
σ←(I2i−1⊗H⊗I2n−i)⋅σ⋅(I2i−1⊗H⊗I2n−i)−1ai,bi←bi,aiIndeed, without loss of generality, suppose that our plaintext ρ has a single qubit. In that case, when we encrypt it, apply a H gate on it in this way, and decrypt it, then the plaintext we obtain is:
(Xb1Za1)⋅H⋅(Xa1Zb1)⋅ρ⋅(Xb1Za1)−1⋅H−1⋅(Xa1Zb1)−1=H⋅ρ⋅H−1The equality holds because (Xb1Za1)⋅H⋅(Xa1Zb1)=H for all (a1,b1)∈B. This technique was firstly proposed by Andrew Childs in this article in a slightly different context.
Evaluation of S
To apply a S gate on the i-th qubit, we do:
σ←(I2i−1⊗S⊗I2n−i)⋅σ⋅(I2i−1⊗S⊗I2n−i)−1bi←ai⊕biNow, to prove that the evaluation of S gives the right result, we ask ourselves whether encrypting a qubit, then applying S on it in the way shown above, and then decrypting it gives us the right plaintext. In other words, we ask ourselves whether this equation holds:
((Xa1Za1⊕b1)⋅S⋅(Xa1Zb1))⋅ρ⋅((Xa1Za1⊕b1)⋅S⋅(Xa1Zb1))−1=S⋅ρ⋅S−1And indeed, this equation holds because, for all (a1,b1), ((Xa1Za1⊕b1)⋅S⋅(Xa1Zb1)) differs from S by only a scalar.
The T gate cannot be evaluated safely without ancilla qubits
Earlier, I said that the Clifford group made from {H,S,CNOT} is not universal. However, if we augment this group by adding the T gate, then the group becomes universal. Therefore, it may be interesting to see whether the T gate can be evaluated safely, as this would enable us to evaluate safely every program.
Unfortunately, the algorithm doesn’t find any way to safely evaluate the T gate. Because of the way that the algorithm was made, this proves that the T gate cannot be safely evaluated without ancilla qubits.
Can the T gate be safely evaluated with ancilla qubits? As Min Liang has made another QFHE scheme with perfect secrecy which uses ancilla qubits for the T gate, it may be possible, as long as this QFHE scheme doesn’t contain any flaw. I haven’t investigated this yet, and this is my next step in my plan.
Concerns about bad actors hiding the training of superintelligences
Without FHE with perfect secrecy, bad actors could hide the training of superintelligences by making air-gapped computers hidden from the public. They may be caught in this process, or they may not have enough resources. However, if they had an efficient FHE scheme with perfect secrecy, they could directly use the computing power of the Cloud, which is way easier to do, and which would make them able to use way more computing power. This would help them avoid getting caught by authorities, and therefore would help them bypass any safety regulation.
After a few days of thought, I concluded that the problem does not arise from the fact that the QFHE scheme I try to develop is perfectly secret. Indeed, although FHE schemes without perfect secrecy aren’t safe enough in order to control misaligned superintelligences, they are safe enough to avoid getting broken by authorities.
Instead, the problem is that the FHE scheme I try to develop has a low time complexity. Indeed, current FHE schemes have a very high time complexity, which makes them useless in order to hide the training of superintelligences.
However, the high time complexity of current FHE schemes can already be partially bypassed by bad actors, by using only partially homomorphic encryption schemes or FHE schemes that work well for very specific computations. These schemes can have a time complexity low enough in order to run neural networks, where the most famous example is this article.
I started getting worried since the moment I first thought about the possibility that bad actors would try to hide their attempts to train superintelligences. I started thinking about whether I should stop my research. I am still unsure about whether it causes any harm, as the problem also somewhat arises from partially homomorphic encryption schemes. The way that my plan, if achieved, could help bad actors is by making the encryption of programs automatic (there is no need to change the code or the architecture of the superintelligence, as long as it is written for quantum computers), and making their evaluation faster because of a lower time complexity.
Personally, I think that bad actors will mostly be caught before starting to train their superintelligence. Indeed, it seems very likely that developing the source code of the superintelligence needs way more people and time than the superintelligence’s training. However, helping bad actors not to be caught during the training is still a significant help for them, and therefore it is a significant harm for everyone else.
Overall, I think that bad actors already have other ways to hide the training of superintelligences while using the computing power of the Cloud, but that QFHE schemes may have a significant chance to help them in this process. I do not know whether the safety that the QFHE provides against superintelligences is outweighted by the risks of bad actors hiding the training of superintelligences. I hope not, but if most people think that my research is net negative, I will stop it.
Conclusion
In this post, we have seen that the symmetric QFHE scheme had a flaw that enables us to obtain information about the plaintext by just making computations on the ciphertext. To prevent this, we added a key update that is independent of the ciphertext, and we made the ciphertext update independent of the key. This enables us to safely evaluate any program made out of Clifford gates, which means that no party involved can have any information about the plaintext, unless they communicate to each other (which is forbidden). However, this fix doesn’t work for the T gate, which means that if this gate can be evaluated safely, then it needs at least one ancilla qubit. I also raised my concerns about whether QFHE schemes could help malevolent actors train superintelligences without being detected by authorities. In the future, I will try to find whether there are non-Clifford gates that can be evaluated safely.