Question: if I’m considering an isolated system (~= “the entire universe”), you say that I can swap between state-vector-format and matrix-format via
|ϕ⟩ ↔ ρ = |ϕ⟩⟨ϕ|. But later, you say...
If H_S is uncoupled to its environment (e.g. we are studying a carefully vacuum-isolated system), then we still have to replace the old state vector picture |ϕ⟩ ∈ H_S by a (possibly rank > 1) density matrix ρ ∈ Op(H_S)…
But if ρ:=|ϕ⟩⟨ϕ|, how could it ever be rank>1?
(Perhaps more generally: what does it mean when a state is represented as a rank>1 density matrix? Or: given that the space of possible ρs is much larger than the space of possible |ϕ⟩s, there are sometimes (always?) multiple ρs that correspond to some particular |ϕ⟩; what’s the significance of choosing one versus another to represent your system’s state?)
The usual story about where rank > 1 density matrices come from is when your subsystem is entangled with an environment that you can’t observe.
The simplest example is to take a Bell state, say |00⟩ + |11⟩ (I'm ignoring normalization, obviously), and imagine you only have access to the first qubit: how should you represent this state? Precisely because it's entangled, we know that there is no |Ψ⟩ in 1-qubit space that will work. The trace method alluded to in the post is to form the (rank-1) density matrix of the Bell state, and then "trace out" the second system; if you think of the density matrix as living in M₂ ⊗ M₂, this means applying the trace operator just to the right factor of the tensor product, i.e. mapping matrix units E_ij ⊗ E_kl to δ_kl E_ij and then extending by linearity.
You can check that for this example you get the (normalized) 2×2 identity matrix, i.e. ½I₂.
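Here is a minimal numerical sketch of that partial-trace computation in numpy (the `partial_trace_right` helper is just something I'm defining for illustration, not a standard API):

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# Normalized Bell state (|00> + |11>)/sqrt(2).
bell = np.kron(ket0, ket0) + np.kron(ket1, ket1)
bell /= np.linalg.norm(bell)

# Rank-1 density matrix of the full two-qubit state.
rho_full = np.outer(bell, bell.conj())

def partial_trace_right(rho, d_left=2, d_right=2):
    """Trace out the right factor: E_ij (x) E_kl  ->  delta_kl * E_ij."""
    rho = rho.reshape(d_left, d_right, d_left, d_right)
    return np.einsum('ikjk->ij', rho)

print(partial_trace_right(rho_full))  # [[0.5, 0.], [0., 0.5]] -- i.e. I/2
```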
You can think of this tracing-out process as a quantum version of marginalization. To get a feel for it intuitively, it's useful to consider the following: suppose you are given access to an endless supply of qubits, each of which is one half of a Bell pair, and you make repeated measurements. What will you see?
It's pretty clear that if you measure in the standard basis, you'll have a 50/50 chance of measuring |0⟩ or |1⟩. This is the sort of thing a first-timer might pattern-match to an equal superposition, but that's not correct: no matter what basis you measure in, you'll obtain 50/50. This is because:
conceptually: you're measuring half of an entangled state, so the whole point is that it can't yield any particular state with certainty under measurement
mathematically: the Bell state can be written as |xx⟩ + |yy⟩ for any orthogonal states |x⟩, |y⟩; there's nothing special about the standard basis. So in any measurement, the entangled state has an equal chance of being "both x" and "both y", and someone who can only see one qubit will see an equal chance of x and of y
formalism-ly (?): the rule for calculating measurement probabilities, |⟨x|y⟩|² = ⟨x|(|y⟩⟨y|)|x⟩, where |y⟩ is your state and |x⟩ is the outcome whose probability you wish to know, generalizes obviously to ⟨x|ρ|x⟩ for any density matrix ρ; in our case ρ is a multiple of the identity and all states have norm 1, so all potential measurement outcomes yield the same probability (see the quick check below)
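As a sanity check of that last rule (just an illustrative script I'm adding, not anything from the post): for ρ = ½I, the Born rule ⟨x|ρ|x⟩ gives 1/2 for every unit vector |x⟩, i.e. 50/50 in every basis.

```python
import numpy as np

rho = np.eye(2) / 2  # reduced state of one half of a Bell pair

rng = np.random.default_rng(0)
for _ in range(5):
    # Random normalized single-qubit state |x>.
    x = rng.normal(size=2) + 1j * rng.normal(size=2)
    x /= np.linalg.norm(x)
    # Born rule with a density matrix: p(x) = <x|rho|x>.
    print(round(np.real(x.conj() @ rho @ x), 6))  # always 0.5
```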
The point about the Lindbladian is that it's pretty generic for rank-1 states to evolve into higher-rank mixed states; it's basically the same idea as decoherence: you entangle with the rest of the world, then lose track of all the precise degrees of freedom, so you only really see a small subsystem of a large entangled state.
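To make that concrete, here is a toy sketch of my own (not an example from the post): pure dephasing, i.e. a Lindblad equation with jump operator σ_z, drives the rank-1 state |+⟩⟨+| toward the rank-2 state ½I₂; you can watch the purity Tr(ρ²) fall from 1 toward 1/2.

```python
import numpy as np

sz = np.array([[1.0, 0.0], [0.0, -1.0]])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)  # rank-1 pure state |+><+|

gamma, dt = 1.0, 0.01
for step in range(401):
    if step % 100 == 0:
        purity = np.trace(rho @ rho).real  # 1 for a pure state, 1/2 for (1/2) * I
        print(f"t = {step * dt:.1f}   purity = {purity:.3f}")
    # Pure-dephasing Lindbladian: d(rho)/dt = gamma * (sz rho sz - rho)
    rho = rho + dt * gamma * (sz @ rho @ sz - rho)
```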
Indeed, it's true that a given higher-rank density matrix can have multiple purifications (rank-1 states of a larger system of which it is the traced-out part corresponding to one subsystem), but that's to be expected in this point of view: if we had perfect knowledge of the whole system, including everything our subsystem had ever become entangled with, we'd use a regular, pure state. The use of a mixed, higher-rank density matrix corresponds to our loss of information to the "environment". And yes, the rank of the density matrix is related to the minimal dimension of the Hilbert space needed for an environment to purify your density matrix.
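Sketching that last sentence in code (my own illustration, with a made-up helper name): from a spectral decomposition ρ = Σᵢ pᵢ|eᵢ⟩⟨eᵢ|, the vector |Ψ⟩ = Σᵢ √pᵢ |eᵢ⟩ ⊗ |i⟩ purifies ρ using one environment dimension per nonzero eigenvalue, i.e. an environment of dimension rank(ρ).

```python
import numpy as np

def purify(rho, tol=1e-12):
    """Build a purification of rho with an environment of dimension rank(rho)."""
    p, vecs = np.linalg.eigh(rho)
    keep = p > tol                  # one environment dimension per nonzero eigenvalue
    p, vecs = p[keep], vecs[:, keep]
    d_env = len(p)
    env = np.eye(d_env)
    psi = sum(np.sqrt(p[i]) * np.kron(vecs[:, i], env[i]) for i in range(d_env))
    return psi, d_env

rho = np.eye(2) / 2                 # the reduced Bell-pair state from before
psi, d_env = purify(rho)
print(d_env)                        # 2 == rank(rho)

# Check: tracing out the environment recovers rho.
full = np.outer(psi, psi.conj()).reshape(2, d_env, 2, d_env)
print(np.einsum('ikjk->ij', full))  # [[0.5, 0.], [0., 0.5]]
```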
Actually, I have a little more to say. Another way to think about higher-rank density matrices is as probability distributions over pure states; I think this is what Charlie Steiner's comment is alluding to.
So, the rank-2 matrix from my previous comment, ρ = ½I₂, can be thought of as ½|0⟩⟨0| + ½|1⟩⟨1|, i.e., an equal probability of observing each of |0⟩, |1⟩. And, because I₂ = |x⟩⟨x| + |y⟩⟨y| for any orthonormal vectors |x⟩, |y⟩, again there's nothing special about using the standard basis here (this is mathematically equivalent to the argument I made in the above comment about why you can use any basis for your measurement).
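As a quick numerical illustration of this non-uniqueness (my own toy check, not from the comment): the standard-basis mixture and the |+⟩/|−⟩ mixture assemble exactly the same density matrix.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

proj = lambda v: np.outer(v, v.conj())

# Two different "probability distributions over pure states"...
rho_standard = 0.5 * proj(ket0) + 0.5 * proj(ket1)
rho_plus_minus = 0.5 * proj(plus) + 0.5 * proj(minus)

# ...give exactly the same density matrix, (1/2) * I.
print(np.allclose(rho_standard, rho_plus_minus))  # True
print(rho_standard)                               # [[0.5, 0.], [0., 0.5]]
```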
I always hated this point of view; it felt really hacky, and I always found it ugly and unmotivated to go from states |Ψ⟩ to projections |Ψ⟩⟨Ψ| just for the sake of taking probability distributions.
The thing above about entanglement and decoherence, IMO, is a more elegant and natural way to see why you'd come up with this formalism. To be explicit, suppose you have the state |0⟩, and there is an environment state that you don't have access to; say it also begins in state |0⟩, and initially everything is unentangled, so we begin in the state |00⟩. Then some unitary evolution happens that entangles us, say one that takes |00⟩ to the Bell state (|00⟩ + |11⟩)/√2.
As we've seen, you should think of your state as being ½I₂, and now it's clear why this is the right framework for probabilistic mixtures of quantum states: it's entirely natural to think of your part of the now-entangled system as "an equal chance of |0⟩ and |1⟩", and this indeed gives us the right density matrix. It also immediately implies that you are forced to allow that it could equally be represented as "an equal chance of |+⟩ and |−⟩", where |±⟩ = (|0⟩ ± |1⟩)/√2, and so on.
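Putting the whole story into one small sketch (again my own illustration; I use the usual Hadamard-then-CNOT circuit only because it is a convenient unitary that sends |00⟩ to the Bell state):

```python
import numpy as np

ket00 = np.zeros(4); ket00[0] = 1.0

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# Unitary evolution that entangles "your" qubit with the environment qubit:
# |00>  ->  (|00> + |11>)/sqrt(2)
psi = CNOT @ np.kron(H, np.eye(2)) @ ket00

# Your reduced state: trace out the environment (the right tensor factor).
rho_full = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)
rho_sys = np.einsum('ikjk->ij', rho_full)
print(rho_sys)  # [[0.5, 0.], [0., 0.5]] -- the rank-2 mixed state (1/2) * I
```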
But this picture makes it clear why we have the non-uniqueness of representation, and where the missing information went: we don't just "have a probabilistic mixture of quantum states"; we have a small part of a big quantum system that we can't see all of, so the best we can do is represent it (non-uniquely) as a probabilistic mixture of quantum states.
Now, you aren’t obliged to take this view, that the only reason we have any uncertainty about our quantum state is because of this sort of decoherence process, but it’s definitely a powerful idea.
Yeah, this also bothered me. The notion of "probability distribution over quantum states" is not a good notion: the matrix I is both |0⟩⟨0| + |1⟩⟨1| and |a⟩⟨a| + |b⟩⟨b| for any other orthonormal basis {|a⟩, |b⟩}. The fact that these should be treated equivalently seems totally arbitrary. The point is that density matrix mechanics is the notion of probability for quantum states, and can be formalized as such (dynamics of informational lower bounds given observations). I was sort of getting at this with the long "explaining probability to an alien" footnote, but I don't think it landed (and I also don't have the right background to make it precise).
Ahhh! Yes, this is very helpful! Thanks for the explanation.