I’m very confused by the mathematical setup. Probably it’s because I’m a mathematician and not a physicist, so I don’t see things that would be clear to a physicist. My knowledge of quantum mechanics is very basic, but nonzero. Here’s how I rewrote the setup part of your paper as I went along; I hope I got everything right.
You have a system S, which is some (separable, complex, etc.) Hilbert space. You also have an observer system O (also a Hilbert space). Elements of the various Hilbert spaces are called “states”. Then you have the joint system S⊗O, of which Ψ is an element, which comes with a (unitary) time-evolution E: S⊗O → S⊗O. Now if S were not being observed, it would evolve by some (unitary) time-evolution E_S: S → S. We assume (though I think functional analysis gives this to us for free) that (φ_i)_i is an orthonormal basis of eigenfunctions of E_S, with eigenvalues (λ_i)_i.
Ok, now comes the trick: we assume that observation doesn’t change the system, i.e. that the S-component of E is E_S. Wait, that doesn’t make sense! E doesn’t have an “S-component”; something like an S-component makes sense only for pure tensors, and for sums of tensors the idea breaks down. Ok, so we assume that E, acting on pure tensors, acts as E_S on the first factor. This would give E: φ⊗ψ ↦ (E_S φ)⊗ψ_φ, where ψ_φ is defined so that this holds. Presumably something goes wrong if we do this, so we instead require the weaker E: φ_i⊗ψ ↦ (E_S φ_i)⊗ψ_i. And bingo! Since the φ_i are eigenfunctions, we get E(φ_i⊗ψ) = φ_i⊗(λ_i ψ_i), and let’s redefine ψ_i to absorb the λ_i factor, because why not. Now, if we write φ = ∑_i a_i φ_i and extend by linearity, we get E: φ⊗ψ ↦ ∑_i a_i φ_i⊗ψ_i. Applying E again transforms each ψ_i once more in the same way, and likewise for further powers.
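This setup is easy to play with numerically. Below is a minimal sketch of my own (the dimension 2, the eigenvalues, and the observer-side unitaries U_i are all hypothetical choices, not from the paper): E is built as ∑_i λ_i |φ_i⟩⟨φ_i| ⊗ U_i, which sends φ_i⊗ψ to (λ_i φ_i)⊗(U_i ψ) and is automatically unitary.

```python
import numpy as np

# Toy model: S and O are both C^2; (phi_i) is the chosen ONB of S.
phi = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
lam = [np.exp(0.3j), np.exp(1.1j)]   # eigenvalues of E_S (unit modulus)

# Hypothetical observer-side unitaries U_i: which one acts depends on i.
U = [np.eye(2, dtype=complex),
     np.array([[0, 1], [1, 0]], dtype=complex)]  # identity, bit flip

# E = sum_i lam_i |phi_i><phi_i| (x) U_i  maps  phi_i (x) psi  to  (E_S phi_i) (x) (U_i psi).
E = sum(lam[i] * np.kron(np.outer(phi[i], phi[i].conj()), U[i]) for i in range(2))
assert np.allclose(E.conj().T @ E, np.eye(4))   # E is unitary

# Extending by linearity: a non-eigenfunction sum_i a_i phi_i goes to sum_i a_i phi_i (x) psi_i.
a = np.array([3, 4], dtype=complex) / 5
psi = np.array([1, 0], dtype=complex)
out = E @ np.kron(a[0] * phi[0] + a[1] * phi[1], psi)
expected = sum(a[i] * lam[i] * np.kron(phi[i], U[i] @ psi) for i in range(2))
assert np.allclose(out, expected)
```

The final assert checks exactly the linearity step above: the superposition is carried branch-by-branch, each branch keeping its own coefficient a_i.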
Ok, let’s interpret that last part in terms of “observations”. If we take states of the combined system S⊗O, then time-evolution maps pure tensors with only a φ_i component to pure tensors with only a φ_i component. Wait, that’s exactly what we assumed; why should we be surprised? Well yeah, but if you start out with some linear combination of eigenfunctions, it will be mapped to a linear combination of pure tensors, and each pure tensor in this combination evolves as assumed, which may or may not be a big deal to you. In a state that is a linear combination of pure tensors, we call each pure tensor a “separate observer”, or something like this. Of course, states in a tensor-product space cannot be written uniquely as a sum of pure tensors. However, if we take our preferred basis (φ_i)_i and expand with respect to that basis in the S-component, this again makes sense.
So it’s super important that we distinguished the eigenfunctions of E_S from the start; unfortunately, we don’t get them out “naturally”. But I guess we learn something about consistency, in the sense that “if eigenfunctions are important, then eigenfunctions are important”.
Ok, now assume our system S is itself a tensor product of N subsystems S_1⊗⋯⊗S_N, which we think of as “repeating a measurement”. If we start with some pure tensor, what we get is (in general) a state which can be written as a linear combination of pure tensors of eigenfunctions. As the eigenfunctions of the different subsystems are different (they are elements of different spaces), if you start out with a non-eigenfunction in each subsystem, you’ll end up with a state that contains different eigenfunctions for the different subsystems. But the “derivation of the Born rule” doesn’t need this step with multiple subsystems; we can see it already with just one system. If we start with a non-eigenfunction ∑_i a_i φ_i, then this gets mapped to a linear combination of pure tensors by the time-evolution. As the time-evolution is unitary and the |a_i|^2 sum to 1, each pure tensor in the combination has squared length |a_i|^2.
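The squared-length claim can be checked directly in a toy model (my own sketch; the coefficients 3/5 and 4/5 are arbitrary): after the evolution, the state is a sum of orthogonal branches a_i φ_i⊗ψ_i, and each branch carries weight |a_i|^2.

```python
import numpy as np

# Post-evolution state: sum_i a_i phi_i (x) psi_i with orthonormal phi_i and psi_i.
a = np.array([3, 4], dtype=complex) / 5      # |a_0|^2 + |a_1|^2 = 1
phi = np.eye(2, dtype=complex)               # columns: phi_i
psi = np.eye(2, dtype=complex)               # columns: psi_i (normalized)

state = sum(a[i] * np.kron(phi[:, i], psi[:, i]) for i in range(2))
assert np.isclose(np.linalg.norm(state), 1.0)   # unitarity preserves total norm

# Each branch phi_i (x) psi_i has squared length |a_i|^2 -- the Born weights.
weights = [abs(a[i]) ** 2 for i in range(2)]
assert np.allclose(weights, [0.36, 0.64])
```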
Thanks for the great paper! I think I’ve finally understood the Everett interpretation. I think the basic point is that if you start by distinguishing your eigenfunctions, then you naturally get out distinguished eigenfunctions. Which is kind of disappointing, because the fact that eigenfunctions are so important is what I find weirdest about QM. I mean, I could accept that the Schrödinger equation gives the evolution of the wave-function, but why care about its eigenfunctions so much?
I mean I could accept that the Schrödinger equation gives the evolution of the wave-function, but why care about its eigenfunctions so much?
I’m not sure if this will be satisfying to you but I like to think about it like this:
Experiments show that the order of quantum measurements matters. The mathematical representation of physical quantities needs to take this into account. One simple kind of non-commutative object is the matrix.
If physical quantities are represented by matrices, the possible measurement outcomes need to be encoded in there somehow. They also need to be real. Both conditions are satisfied by the eigenvalues of self-adjoint matrices.
Experiments also show that if we immediately repeat a measurement, we get the same outcome again. So if eigenvalues represent measurement outcomes, the state of the system after the measurement must be related to them somehow. Letting the post-measurement state be the eigenvector associated with the measured eigenvalue is a simple realization of this.
This isn’t a derivation but it makes the mathematical structure of QM somewhat plausible to me.
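The three observations above can each be exhibited concretely. A sketch of my own, using the Pauli matrices as the example observables (the comment itself doesn't name any particular matrices):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X, self-adjoint
sz = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z, self-adjoint

# 1) Order matters: the matrices don't commute.
assert not np.allclose(sx @ sz, sz @ sx)

# 2) Self-adjoint matrices have real eigenvalues.
vals, vecs = np.linalg.eigh(sx)
assert np.allclose(vals, [-1, 1])

# 3) Repeating a measurement reproduces the outcome: once the state is an
#    eigenvector, projecting onto it again changes nothing.
v = vecs[:, 1]                        # eigenvector of sx for eigenvalue +1
P = np.outer(v, v.conj())             # projector onto that eigenvector
assert np.allclose(P @ v, v)          # a second measurement returns the same state
```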
Right, but (before reading your post) I had assumed that the eigenvectors somehow “popped out” of the Everett interpretation. But it seems like they are built in from the start. Which is fine, it’s just deeply weird. So it’s kind of hard to say whether the Everett interpretation is more elegant. In the Copenhagen interpretation, you say “measuring can only yield eigenvectors”; in the Everett interpretation, you say “measuring can only yield eigenvectors, and all measurements are included in the description, so the whole thing stays unitary”. But in the end even the Everett interpretation distinguishes “observers” somehow. I mean, in the setup you describe there isn’t any reason why we can’t call the “state space” the observer space and the observer “the system being studied”, and then write down the same system from the other point of view...
The “symmetric matrices ↔ real eigenvalues” correspondence is of course important; this is essentially just the spectral theorem, which tells us that real linear combinations of orthogonal projections are symmetric matrices (and vice versa).
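That correspondence is easy to verify numerically (my own sketch, with a random symmetric matrix): diagonalize, then rebuild the matrix as a real combination of orthogonal projections.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                      # a real symmetric matrix

vals, vecs = np.linalg.eigh(A)         # real eigenvalues, orthonormal eigenvectors

# Spectral theorem: A = sum_i lambda_i P_i with mutually orthogonal projectors P_i.
projs = [np.outer(vecs[:, i], vecs[:, i]) for i in range(3)]
assert np.allclose(sum(v * P for v, P in zip(vals, projs)), A)
assert np.allclose(projs[0] @ projs[1], 0)   # distinct projectors are orthogonal
```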
Nowadays matrices are seen as “simple non-commutative objects”; I’m not sure that was true when QM was being developed. But then again, I’m not really sure how linear QM “really” is. I mean, all of this takes place on vectors with norm 1 (and the results are invariant under change of phase), and once we quotient out the norm, most of the linear structure is gone. I’m not sure what the correct way to think about the phase is. On one hand, it seems like a kind of “fake” unobservable variable, and it should be permissible to quotient it out somehow. On the other hand, the complex-ness of the Schrödinger equation seems really important. But is this complex-ness a red herring? What goes wrong if we just take our “base states” as discrete objects and try to model QM as the evolution of probability distributions over ordered pairs of these states?
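One concrete thing that goes wrong with a purely probabilistic model is interference: amplitudes can cancel, while probabilities can only add. A minimal sketch of my own, comparing a “beam splitter” unitary with the corresponding stochastic matrix:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard: a 50-50 beam splitter

# Quantum: apply the splitter twice to state 0. Amplitudes interfere and recombine.
amp = H @ (H @ np.array([1.0, 0.0]))
assert np.allclose(abs(amp) ** 2, [1.0, 0.0])   # all the weight returns to state 0

# Classical analogue: evolve a probability distribution with the stochastic matrix
# of transition probabilities |H_ij|^2. Probabilities can only mix, never cancel.
P = abs(H) ** 2                                  # [[0.5, 0.5], [0.5, 0.5]]
prob = P @ (P @ np.array([1.0, 0.0]))
assert np.allclose(prob, [0.5, 0.5])             # stays maximally mixed
```

The relative phase (the sign in H) is exactly what the probabilistic description throws away, and it is what makes the two predictions differ.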
Right, but (before reading your post) I had assumed that the eigenvectors somehow “popped out” of the Everett interpretation.
This is a bit of a tangent but decoherence isn’t exclusive to the Everett interpretation. Decoherence is itself a measurable physical process independent of the interpretation one favors. So explanations which rely on decoherence are part of all interpretations.
I mean in the setup you describe there isn’t any reason why we can’t call the “state space” the observer space and the observer “the system being studied” and then write down the same system from the other point of view...
In the derivations of decoherence you make certain approximations which loosely speaking depend on the environment being big relative to the quantum system. If you change the roles these approximations aren’t valid any more. I’m not sure if we are on the same page regarding decoherence, though (see my other reply to your post).
What goes wrong if we just take our “base states” as discrete objects and try to model QM as the evolution of probability distributions over ordered pairs of these states?
You might be interested in Lucien Hardy’s attempt to find a more intuitive set of axioms for QM compared to the abstractness of the usual presentation: https://arxiv.org/abs/quant-ph/0101012
Isn’t the whole point of the Everett interpretation that there is no decoherence? We have a Hilbert space for the system and a Hilbert space for the observer, and a unitary evolution on their tensor product. With these postulates (and a few more), we can start with a pure tensor and end up with a sum of pure tensors in the product space, which we then interpret as “multiple observers”, right? I mean, this is how I read your paper.
We are surely not on the same page regarding decoherence, as I know almost nothing about it :)
The arXiv link looks interesting; I should have a look at it.
Yes, the coherence-based approach (Everett’s original paper, early MWI) is quite different from the decoherence-based approach (Dieter Zeh, post-1970).
Deutsch uses the coherence-based approach, while most other many-worlders use the decoherence-based approach.
He absolutely does establish that quantum computing is superior to classical computing, that the underlying reality is not classical, and that the superiority of quantum computing requires some extra structure to reality. What the coherence-based approach does not establish is whether the extra structure adds up to something that could be called “alternate worlds” or parallel universes, in the sense familiar from science fiction.
In the coherence-based approach, “worlds” are coherent superpositions. That means they exist at small scales, they can continue to interact with each other after “splitting”, and they can be erased. These coherently superposed states are the kind of “world” we have direct evidence for, although they seem to lack many of the properties required for a fully fledged many-worlds theory, hence the scare quotes.
In particular, if you just model the wave function, the only result you will get is one that represents every possible outcome. In order to match observation, you will have to keep discarding unobserved outcomes and renormalising, as you do in every interpretation. It’s just that this extra stage is performed manually, not by the programme.
I don’t know if it would make things clearer, but questions about why eigenvectors of Hermitian operators are important can basically be recast as the question of why orthogonal states correspond to mutually exclusive ‘outcomes’. From that starting point, projection-valued measures let you associate real numbers to the various orthogonal outcomes, and that’s how you build the operator with the corresponding eigenvectors.
As for why orthogonal states are important in the first place, the natural thing to point to is the unitary dynamics (though there are also various more sophisticated arguments).
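The projection-valued-measure construction can be sketched in a few lines (my own example; the basis and the outcome labels are arbitrary choices): pick any ONB of “outcome” states, attach a real number to each, and sum the labeled projectors. The result is a self-adjoint operator whose eigenvalues are exactly the labels and whose eigenvectors are the outcome states.

```python
import numpy as np

# Orthonormal 'outcome' states (columns of a random orthogonal matrix) and real labels.
outcomes = np.linalg.qr(np.random.default_rng(1).standard_normal((3, 3)))[0]
labels = np.array([2.0, -1.0, 5.0])

# Projection-valued measure: P_i = |e_i><e_i|; the observable is A = sum_i a_i P_i.
A = sum(a * np.outer(outcomes[:, i], outcomes[:, i]) for i, a in enumerate(labels))

assert np.allclose(A, A.T)                          # self-adjoint (real symmetric here)
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(labels))
for i in range(3):
    e = outcomes[:, i]
    assert np.allclose(A @ e, labels[i] * e)        # each outcome state is an eigenvector
```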
Yes, I know all of this; I’m a mathematician, just not one researching QM. The arXiv link looks interesting, but I have no time to read it right now. The question isn’t “why are eigenvectors of Hermitian operators interesting”; it is “why would we expect a system doing something as reasonable as evolving via the Schrödinger equation to do something as unreasonable as suddenly collapsing to one of its eigenfunctions”.
I guess I don’t understand the question. If we accept that mutually exclusive states are represented by orthogonal vectors, and we want to distinguish mutually exclusive states of some interesting subsystem, then what’s unreasonable about defining a “measurement” as something that correlates our apparatus with the orthogonal states of the interesting subsystem, or at least as an ideal form of a measurement?
I think my question isn’t really well-defined. I guess it’s more along the lines of “is there some ‘natural seeming’ reasoning procedure that gets me QM ”.
And it’s even less well-defined as I have no clear understanding of what QM is, as all my attempts to learn it eventually run into problems where something just doesn’t make sense—not because I can’t follow the math, but because I can’t follow the interpretation.
If we accept that mutually exclusive states are represented by orthogonal vectors, and we want to distinguish mutually exclusive states of some interesting subsystem, then what’s unreasonable about defining a “measurement” as something that correlates our apparatus with the orthogonal states of the interesting subsystem, or at least as an ideal form of a measurement?
Yes, this makes sense, though “mutually exclusive states are represented by orthogonal vectors” is still really weird. I kind of get why Hermitian operators make sense here, but then we apply the measurement and the system collapses to one of its eigenfunctions. Why?
I kind of get why Hermitian operators make sense here, but then we apply the measurement and the system collapses to one of its eigenfunctions. Why?
If I understand what you mean, this is a consequence of what we defined as a measurement (or what’s sometimes called a pre-measurement). Taking the tensor product structure and density matrix formalism as a given, if the interesting subsystem starts in a pure state, the unitary measurement structure implies that the reduced state of the interesting subsystem will generally be a mixed state after measurement. You might find parts of this review informative; it covers pre-measurements and also weak measurements, and in particular talks about how to actually implement measurements with an interaction Hamiltonian.
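A pre-measurement of this kind can be sketched in a few lines (a toy example of my own, not taken from the review mentioned above): the system starts pure, a CNOT-like unitary correlates it with the apparatus, and the reduced state of the system, obtained by a partial trace, comes out mixed.

```python
import numpy as np

a = np.array([3, 4], dtype=complex) / 5      # system: a_0|0> + a_1|1>, a pure state
apparatus = np.array([1, 0], dtype=complex)  # "ready" pointer state

# CNOT with the system as control: correlates apparatus with the system's basis states.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

joint = CNOT @ np.kron(a, apparatus)         # = a_0|00> + a_1|11>, entangled

# Reduced state of the system: partial trace over the apparatus.
rho_joint = np.outer(joint, joint.conj())
rho_S = np.trace(rho_joint.reshape(2, 2, 2, 2), axis1=1, axis2=3)

assert np.allclose(rho_S, np.diag([0.36, 0.64]))  # diagonal in the measured basis
purity = np.trace(rho_S @ rho_S).real
assert purity < 1                                 # pure before, mixed after
```

The purity Tr(ρ²) drops from 1 to 0.36² + 0.64² ≈ 0.54, which is the sense in which the subsystem ends up in a mixed state even though the joint evolution was unitary.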
You could also turn this question around. If you find it somewhat plausible that self-adjoint operators represent physical quantities, eigenvalues represent measurement outcomes, and eigenvectors represent states associated with these outcomes (per the arguments I gave in my other post), one could picture a situation where systems hop from eigenvector to eigenvector through time. From this point of view, continuous evolution between states is the strange thing.
The paper by Hardy I cited in another answer to you tries to make QM as similar to a classical probabilistic framework as possible and the sole difference between his two frameworks is that there are continuous transformations between states in the quantum case. (But notice that he works in a finite-dimensional setting which doesn’t easily permit important features of QM like the canonical commutation relations).
Ok, now comes the trick: we assume that observation doesn’t change the system
and this
I think the basic point is that if you start by distinguishing your eigenfunctions, then you naturally get out distinguished eigenfunctions.
doesn’t sound correct to me.
The basis in which the diagonalization happens isn’t put in at the beginning. It is determined by the nature of the interaction between the system and its environment. See “environment-induced superselection”, or “einselection” for short.
Ok, but the OP of the post above starts with “Suppose we have a system S with eigenfunctions {φi}”, so I don’t see why (or how) they should depend on the observer. I’m not claiming these are just arbitrary functions. The point is that requiring the time-evolution to map pure tensors of the form φ_i⊗ψ to pure tensors of the same kind is an arbitrary choice that distinguishes the eigenfunctions. Why can’t we choose any other orthonormal basis at this point, say some ONB (w_i)_i, and require that w_i⊗ψ ↦ (E_S w_i)⊗ψ_i, where ψ_i is defined so that this makes sense and is unitary? (I guess this is what you mean by “diagonalization”, but I dislike the term, because if we choose a non-eigenfunction orthonormal basis the construction still “works”; the representation just won’t be diagonal in the first component.)
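Indeed the construction goes through for an arbitrary ONB. A numerical sketch of my own (random unitaries throughout, nothing here is from the original post): define E by E(w_i⊗ψ) = (E_S w_i)⊗(U_i ψ) for any ONB (w_i), and E is unitary regardless of whether the w_i are eigenfunctions of E_S.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unitary(n, rng):
    # QR decomposition of a complex Gaussian matrix yields a random unitary Q.
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

E_S = random_unitary(2, rng)                     # arbitrary unitary evolution on S
W = random_unitary(2, rng)                       # columns w_i: an ONB that need NOT diagonalize E_S
U = [random_unitary(2, rng) for _ in range(2)]   # observer-side unitaries

# E (w_i (x) psi) = (E_S w_i) (x) (U_i psi), extended linearly:
E = sum(np.kron(E_S @ np.outer(W[:, i], W[:, i].conj()), U[i]) for i in range(2))

assert np.allclose(E.conj().T @ E, np.eye(4))    # unitary for ANY choice of ONB (w_i)

# Check the defining property on w_0 with a random observer state psi:
psi = rng.standard_normal(2) + 1j * rng.standard_normal(2)
psi /= np.linalg.norm(psi)
assert np.allclose(E @ np.kron(W[:, 0], psi), np.kron(E_S @ W[:, 0], U[0] @ psi))
```

So nothing in the unitarity requirement itself picks out the eigenbasis; the choice of (w_i) is extra input, which is exactly the point at issue.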
Well yeah sure. But continuity is a much easier pill to swallow than “continuity only when you aren’t looking”.