Probably a little insane, but there is prior work. Saying “hey, let’s do this but with ambient decision theory” isn’t much of a leap.
I have a more insane idea about under what specific conditions the Born rule isn’t an accurate approximation for decision theoretic purposes (which is sort of like a hypothesis about one possible convergent universal instrumental value (a convergent way for superintelligence-instantiations with near-arbitrary initial goal systems to collectively optimize the universe (or a decision policy attractor that superintelligence-instantiations will predictably fall into))), but the margin is too small to explain it and it builds on this other INCREDIBLY AWESOME idea of Steve Rayhawk’s and I don’t want to steal his thunder. (I previously hinted at it as a way to revive the dead even when there’s no information about them left in your light cone and not just with stupid tricks like running through all possible programs. Sounds impossible right? Bwa ha ha. [Edit: Actually I’m not sure if it’s technically still in your light cone or not. I’d have to think. I don’t like thinking.]) Hey Steve, would you mind briefly explaining the reversible computing idea here? Pretty please so I don’t have to keep annoying LW by being all seekrit?
I am going to have to do a proper LW sequence demonstrating why Many Worlds is of little or no interest as a serious theory of physics. Reduced to a slogan: Either you specify what parts of the wavefunction correspond to observable reality, and then you fail to comply with relativity and the Born rule, or else you don’t specify what parts of the wavefunction correspond to observable reality, and then you fail to have a theory. The Deutsch-Wallace approach of obtaining the Born rule from decision theory rather than from actual frequencies of events in the multiverse IMHO is just a hopeless attempt to get around this dilemma, by redefining probability so it’s not about event frequencies.
What are your thoughts on the alternatives? My frustration with the subject as it is discussed on LW is that everyone only speaks about Many Worlds in contrast to Copenhagen. It’s a bit like watching a man beat up a little girl and concluding he is the strongest man in the world. I wish people here would pay more attention to the live alternatives. I find De Broglie-Bohm particularly intriguing. I hope you write the sequence.
These are my rather circumspect thoughts on where the answer will come from.
Bohmian mechanics is ontologically incompatible with special relativity. You still get relativistic effects, but the theory requires a notion of absolute simultaneity in order to be written down, so there is an undetectable preferred reference frame. Still, I have at least two technical reasons (weak values and holographic renormalization group) for thinking it is close to something important, so it’s certainly in my thoughts (when I think about this topic).
I’ll admit that emergent Lorentz invariance is not a completely unreasonable idea; there are many other examples of emergent symmetry. Though I wonder how it looks when you try to obtain emergent diffeomorphism invariance (for general relativity) as well.
Goldstein and Tumulka’s model looks a little artificial. They allow for entanglement but not interaction, and thereby avoid causal loops in time. I’m far more interested in attempts to derive quantum nonlocality from time loops. Possibly their model can be obtained from a time-loop model in the limit where interactions are negligible.
Either you specify what parts of the wavefunction correspond to observable reality, and then you fail to comply with relativity and the Born rule, or else you don’t specify what parts of the wavefunction correspond to observable reality, and then you fail to have a theory.
If you spent as much energy actually trying to understand what MWI actually means as you do trying to argue against it, we wouldn’t have to play Karma Whack-A-Mole every time you start spouting nonsense on here.
Very well; I ask you to exhibit one, just one, example of a wavefunction in which you can (1) specify which parts are the “worlds” / “branches” / whatever-you-call-them (2) explain how it is that those parts are relativistically covariant (3) explain why it is that one branch can have an unequal probability compared to another branch, when they are both equally real.
Let’s call |1> and |0> the two energy states of a two-state system. We can then imagine the system being in the state (3/5) |1> + (4/5) |0>. If the state of the universe except for this system is called |U>, the two “worlds” (or as I like to call them, “dimensions in Hilbert space”) are then |U> x |1> = |U1> and |U> x |0> = |U0>.
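(Just to make this concrete, here is a minimal numpy sketch of that setup; the single extra qubit standing in for |U> is purely an illustrative assumption.)

```python
import numpy as np

ket0 = np.array([1.0, 0.0])   # |0>
ket1 = np.array([0.0, 1.0])   # |1>
U = np.array([1.0, 0.0])      # a one-qubit stand-in for |U>, the rest of the universe

# the two "dimensions in Hilbert space" described above
U1 = np.kron(U, ket1)         # |U> x |1> = |U1>
U0 = np.kron(U, ket0)         # |U> x |0> = |U0>

# the full state |U> x ( (3/5)|1> + (4/5)|0> )
psi = (3/5) * U1 + (4/5) * U0

print(np.dot(psi, psi))       # 1.0 -- normalized
print((3/5)**2, (4/5)**2)     # 0.36 and 0.64 -- the squared amplitudes discussed below
```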
Using this example, could you more specifically state your problems?
1) The worlds have been defined with respect to a particular reference frame. What happens to them under a Lorentz boost?
2) Why does |U1> have a probability of .36 and why does |U0> have a probability of .64? In the very small multiverse you have described, they each exist once, so it seems like they should each have a probability of .5.
1) Under a Lorentz boost, the energy levels become closer together to obey time dilation. Not seein’ the problem here.
2) Because we’ve observed it in experiment. Sure, there are lots of justifications within the framework of quantum mechanics (in collapse-speak, if we measure the particle, the probability we collapse to an eigenfunction is the square of its amplitude; in MW-speak, when we get entangled with the particle, what we’ll see from the inside is a mixed state in the eigenstate basis with probabilities equal to the square of the amplitudes), but that framework exists because it’s what matches experiment. A coin weighted on the heads side only has two possibilities too, but it doesn’t have a .5 chance of landing heads or tails.
1) Under a Lorentz boost, the energy levels become closer together to obey time dilation. Not seein’ the problem here.
Aren’t your quantum states defined only on the hypersurfaces of a particular foliation of space-time? That’s the problem. By reifying these states, you also have to reify the hypersurfaces they are defined on.
2) Because we’ve observed it in experiment.
But you haven’t explained how it is that your theory predicts what we observe. You’ve said there are two worlds, “3/5 |U> x |1>” and “4/5 |U> x |0>”. Two worlds, they both exist, one contains |1>, the other contains |0>, seems like |1> and |0> should be equally probable. Instead we observe them with unequal frequency.
Ah, you meant what’s the effect on entangled particles at different locations? I still don’t see that there’s a problem. You just see a different slice of Hilbert space, and Hilbert space is what gets realified (new word) by MW. In fact, I’d say it handles relativity better than a way of thinking that involves lots of collapses—if we’re a light-year apart and each measure an independent particle at the same coordinate time, an objective collapse violates either the Copernican principle or relativity—we can’t have independent objective collapses.
If you want an explanation of how you get a probabilistic state from an entangled state (“how the theory predicts what we observe”), check out partial traces.
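(For concreteness, a numpy sketch of the partial-trace suggestion. The entangling step, an idealized measurement that correlates an apparatus qubit with the system, is an assumption made purely for illustration.)

```python
import numpy as np

ket0, ket1 = np.eye(2)   # |0>, |1>

# after an idealized measurement the system and apparatus are entangled:
#   (3/5)|1>|A1> + (4/5)|0>|A0>
psi = (3/5) * np.kron(ket1, ket1) + (4/5) * np.kron(ket0, ket0)

rho = np.outer(psi, psi).reshape(2, 2, 2, 2)   # axes: sys, app, sys', app'
rho_sys = np.trace(rho, axis1=1, axis2=3)      # partial trace over the apparatus

print(rho_sys)
# [[0.64 0.  ]
#  [0.   0.36]]  -- a mixed state whose diagonal entries are the Born weights
```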
you meant what’s the effect on entangled particles at different locations?
No, I just meant that your Hilbert space is associated with a preferred foliation. The states in the Hilbert space are superpositions of configurations on the slices of that foliation. If you follow Copenhagen, observables are real, wavefunctions are not, and this foliation-dependence of the wavefunctions doesn’t matter. It’s like fixing a gauge, doing your calculation, and then getting gauge-invariant results back for the observables. These results—expectation values, correlation functions… - don’t require any preferred foliation for their definition. The wavefunctions do, but they are just regarded as constructs.
So Copenhagen gets to be consistent with special relativity at the price of being incomplete. Now according to Many Worlds, we can obtain a complete description of physical reality by saying that wavefunctions are real. What I am pointing out is that wavefunctions are defined with respect to a reference frame. Time is not an operator and you need surfaces of simultaneity for Schrodinger evolution. The surface of simultaneity that it lives on is one of the necessary ingredients for defining a wavefunction. If the wavefunction is real, then so is the surface of simultaneity, but the whole point of special relativity is that there is no absolute simultaneity. So how do you, a wavefunction realist, get around this?
If you want an explanation of how you get a probabilistic state from an entangled state (“how the theory predicts what we observe”), check out partial traces.
So, let’s return to your example. The wavefunction of the universe is “|U> x ( (3/5) |1> + (4/5) |0> )”. Well, this isn’t a great example because the wavefunction factorizes. But anyway, let’s suppose that the reduced density matrix of your two-state system is (9/25) |1><1| + (16/25) |0><0|. You still need to explain how the Born rule makes sense in terms of a multiverse.
Perhaps an analogy will make this clearer. Suppose I’m a car dealer, and you place an order with me for 9 BMWs and 16 Rolls-Royces. Then you come to collect your order, and what you find is one BMW with a “3” painted on it, and one Rolls-Royce with a “4” painted on it. You complain that I haven’t filled the order, and I say, just square the number painted on each car, and you’ll get what you want. So far as I can see, that’s how MWI works. You work with the same wavefunctions that Copenhagen uses, but you want to do without the Born rule. So instead, you pull out a reduced density matrix, point at the coefficients, and say “you can get your probabilities from those”.
That’s not good enough. If quantum mechanics is to be explained by Many Worlds, I need to get the Born rule frequencies of events from the frequencies with which those events occur in the multiverse. Otherwise I’m just painting a number on a state vector and saying “square it”. If you don’t have some way to decompose that density matrix into parts, so that I actually have 9 instances of |1> and 16 instances of |0>, or some other way to obtain Born frequencies by counting branches, then how can you say that Many Worlds makes the right predictions?
Once you get into field theory you have x, y, z and t all treated as coordinates, not operators. The universe realio trulio starts to look like a 4-dimensional object, and reference frames are just slices of this 4-dimensional object. And I guess you’re right, if you don’t use relativistic quantum mechanics, you won’t have all the nice relativistic properties.
If you want your probabilities to be frequencies, I suppose you could work out the results if you wanted. The run-of-identical-experiment frequencies should actually be pretty easy to calculate, and will give the same answer whether or not you collapse, for obvious mathematical equivalence reasons. And if that’s good enough for you to accept that the outputs of ordinary quantum mechanics “really are” probabilities, maybe it will be good enough for slightly less ordinary quantum mechanics.
A better exercise to explore the unique probabilities in MW might be to show that, if our observer gets totally entangled with a series of two-state systems, the probabilities given by the partial density matrix evolve according to the rules you’d expect from collapse. Note that this isn’t just another boring mathematical equivalence. Humans are interactions between a bunch of multi-state systems. If we evolve in a way that looks like collapse, we’ll see something that looks like collapse!
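(A sketch of that exercise, with the “measurement” modelled as an idealized interaction that copies each system’s eigenbasis into a fresh record qubit; that modelling choice is an assumption made for illustration.)

```python
import numpy as np
from functools import reduce
from itertools import product

a, b = 3/5, 4/5                      # amplitudes of |1> and |0> for each system

# one system plus the record qubit it is copied into, in the basis |s r>:
#   (4/5)|0>|r=0> + (3/5)|1>|r=1>
pair = np.array([b, 0.0, 0.0, a])

N = 3
psi = reduce(np.kron, [pair] * N)    # the observer's record entangled with N systems
amp = psi.reshape([2] * (2 * N))     # axes: s1, r1, s2, r2, ..., sN, rN

# probabilities over the record qubits alone (systems traced out)
probs = (np.abs(amp) ** 2).sum(axis=tuple(range(0, 2 * N, 2)))

# compare with what a sequence of collapses would predict
for r in product([0, 1], repeat=N):
    collapse = np.prod([b**2 if bit == 0 else a**2 for bit in r])
    print(r, round(float(probs[r]), 4), round(float(collapse), 4))   # the two columns agree
```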
The universe realio trulio starts to look like a 4-dimensional object, and reference frames are just slices of this 4-dimensional object.
But the quantum wavefunction isn’t a four-dimensional object. If we’re doing field theory, it’s an object in an infinite-dimensional space. The four-dimensionality of field theory resides in the operators, not the wavefunctions. So if I say that the observables corresponding to operators are what’s real, I can think relativistically about space and time, because everything that’s real is always anchored to a specific point in space-time, and the notion of a point doesn’t involve absolute simultaneity. But if I say that the wavefunctions are real, then I have to say that the spacelike hypersurfaces with which they are associated are also real.
If you want your probabilities to be frequencies
What else can they be, in a Many Worlds theory? The whole meaning of Many Worlds is that this is one world among many. There are other worlds and things happen differently there. So if we do the math and add up the frequencies for physical events across all the worlds, we had better find out that ours is a typical sort of world.
Unfortunately, a lot of people who talk about Many Worlds never even think things through this far. They just think “unitary evolution produces decoherence, decoherence diagonalizes a particular basis, observable reality is one of those basis states, therefore the wavefunction of the universe contains observable reality and I don’t need to say any more”. In particular, Many Worlds advocates tend to surreptitiously rely on the Born rule in order to explain the observed frequencies of events. Without something like a Born rule, a partial density matrix is just a mathematical object. If you inspect it, you will not see multiple copies of anything. Instead, you will see an array of numbers. It’s just like the parable of the car dealer. If I am to deliver on your order for 9 BMWs, I have to hand over nine cars, not one car with a number painted on it. Many Worlds fails to deliver on its promise for exactly the same reason.
But the quantum wavefunction isn’t a four-dimensional object.
The four-dimensionality of field theory resides in the operators, not the wavefunctions.
But if I say that the wavefunctions are real, then I have to say that the spacelike hypersurfaces with which they are associated are also real.
I don’t see why any of this is true. You’ll have to unpack more and make it easier to understand, maybe.
If you want your probabilities to be frequencies
What else can they be, in a Many Worlds theory? The whole meaning of Many Worlds is that this is one world among many.
Whoa whoa whoa. No. You should not be putting this much effort if you don’t agree that the “worlds” are a convenient but misleading way to describe it.
I don’t see why any of this is true. You’ll have to unpack more
OK:
the quantum wavefunction isn’t a four-dimensional object.
What is a four-dimensional object? It’s an object which lives in four dimensions. What does that mean? It means its parts can be located in four-dimensional space. If that’s Minkowski space, then we can look at the object from the perspective of various relativity-compliant reference frames.
Now what is a wavefunction? It can only be regarded as four-dimensional in this sense if it’s the wavefunction of a single particle. Once you talk about wavefunctions for multiple particles, or wavefunctionals for quantum fields, they don’t have localizable parts. Their constituent amplitudes are at best “multilocal”, e.g. you have amplitudes for a set of n mutually spacelike points.
The four-dimensionality of field theory resides in the operators, not the wavefunctions.
The field operators are indexed by space-time coordinates—they have the form psi(x), where psi is the field, x is a spatial or a space-time position, and psi(x) is an operator which can be applied to a wavefunctional of amplitudes for field configurations. So the operators for a field are four-dimensional (in four-dimensional quantum field theory) because there is a four-dimensional manifold of them. This is because the space-time value of the field is the corresponding observable and the field potentially has a value anywhere in four-dimensional space-time.
But if I say that the wavefunctions are real, then I have to say that the spacelike hypersurfaces with which they are associated are also real.
Wavefunctions, and this is especially clear for multiparticle configurations and for fields, are superpositions of configurations defined on some spacelike hypersurface. The hypersurface is part of the definition of the wavefunction, one of the conceptually essential ingredients. So if the wavefunction is real, so is the hypersurface on which it is defined.
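(In symbols, restating the points above in standard notation, nothing new added:)

```latex
% A field operator is labelled by a single spacetime point x:
\hat{\phi}(x), \qquad x \in \text{Minkowski space}.

% A wavefunctional is not: it assigns an amplitude to an entire field
% configuration \phi defined on a chosen spacelike hypersurface \Sigma,
\Psi_\Sigma[\phi] = \langle \phi \mid \Psi_\Sigma \rangle,
\qquad \phi : \Sigma \to \mathbb{R},

% so the hypersurface \Sigma enters the definition of the state itself.
```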
You should not be putting this much effort if you don’t agree that the “worlds” are a convenient but misleading way to describe it.
I refer to my dilemma for Many Worlds advocates, as quoted by orthonormal. If you cannot say what the worlds are, then you don’t have a theory. You may think you have a theory, but you don’t, because the worlds (branches, configurations, something) are supposed to be the point of contact between the actually-real wavefunction and observable reality.
One reason I am so strident on this topic is that belief in Many Worlds often seems to be based on half-examined notions that don’t even make sense when you manage to force them into words. The typical non-physicist’s idea of Many Worlds is that it involves many worlds, parallel universes just like in science fiction. The typical physicist’s idea of Many Worlds is more like “wavefunction collapse does not occur”; it’s a “no-collapse” interpretation. But this is the interpretation which is nonsense unless you force it into the mold of the “naive” Many Worlds interpretation, at which point it becomes susceptible to critique and falsification.
The no-collapse interpretation survives in physicists’ minds because of two things: first, Copenhagen tells us that we can get empirically accurate results from wavefunctions; second, doublethink about the meaning of decoherence. No-collapse advocates want to say that decoherence explains how to see observed reality, deep inside the wavefunction, but they won’t take this approach seriously enough to say that the components, aspects, or substructures of the wavefunction that they are pointing to, are really real—because that would be like having a preferred basis. This attitude insulates them from having to address the problems with relativity and the Born rule that people who do talk about worlds face. That’s why I call it doublethink.
If reality is to be found inside the wavefunction by decomposing the local density matrix in the most diagonal basis, then you’re saying that observable reality is one of those local basis states, and you are accountable for explaining why the square of its coefficient corresponds to the frequency with which the corresponding events are actually seen to happen.
Hm, stuff does seem to be more complicated than I’d thought.
Still, saying “and so, collapse happens” doesn’t sit well with me at all, for much-better-understood nonrelativistic QM reasons. Say we’re on opposite sides of a reasonably information-proof wall, and I measure a 2-state system. This is an identical problem to Schrodinger’s cat measuring the decay of an atom—I go into a macroscopic superposition. If you had a 2-state system that was entangled with my 2-state system, you could do a Bell inequality measurement on a signal that I send—even if I send the message manually—and it would show that I really am in this quantum state. On the other hand, from my perspective, when I measure a system I get an answer. So from your perspective I’m in an entangled state, and from my perspective I’ve measured a definite value. How would collapse replicate this sort of subjectivity?
Yet another reason why the Copenhagen interpretation, in its true form, does not reify the wavefunction. “Collapse” is just like the update of a prior in the light of new knowledge; you throw away the parts of a probability distribution which are now knowably not relevant. According to Copenhagen, it is the observables that are real, and the wavefunctions are just tabulations of incomplete knowledge. The Copenhagen interpretation only leads you astray if you try to defend the idea that QM according to Copenhagen is a complete theory. But if you’re happy with the idea that QM is incomplete (and thus not the final word in physics), then Copenhagen is your guide. The problem of collapsing wavefunctions is entirely an artefact of belief in wavefunctions. The real problem is simply to explain what’s behind the success of QM, and wavefunction realism is just one possible approach.
It is not my favorite, but an approach which should at least be easy to understand is the “zigzag in time” interpretation, which says that spacelike correlations are due to microscopic time loops. Physics is local, but there are inflection points where forward-in-time causality turns into backwards-in-time causality, and the actual causal web of the universe therefore involves nonlocal-looking regularities. On this view, quantum mechanics is the statistical mechanics of a physics with causal chains running forward and backward in time, and such a physics becomes possible with general relativity.
The first part of this idea—causes operating in both directions of time—is almost as old as quantum mechanics. It’s in the Wheeler-Feynman absorber theory, the transactional interpretation of John Cramer, Yakir Aharonov’s time-symmetric quantum mechanics, and the work of Huw Price, among others; but I prefer the relatively obscure work of Mark Hadley, because he gives it the clearest foundation: the “inflection” in which the time direction of a causal chain reverses, as arising from a non-time-orientable patch in the space-time 4-manifold.
If the microscopic topology of space-time admits such regions, then not only is its evolution in time non-deterministic, but it will be non-deterministic in a complexly correlated way: causal loops in the far future topology constrain what happens on a spacelike hypersurface in the present, in a way that looks highly nonlocal. One manifestation of this would be nonlocally correlated perturbations to the passage of a particle or a wave through space, perturbations correlated not just with each other but also with distant distributions of matter; thus, the effects seen in the double-slit experiment, and all the other standard quantum phenomena.
If this approach worked, it would be very elegant, because it would turn out that quantum mechanics is a microscopic side effect of general relativity. It would require the matter fields to exhibit microscopic violations of the energy conditions which normally prevent wormholes and time machines, but this is not impossible, there are many simple models in which the energy conditions are violated. The challenge would be to show (1) a combination of fields which exhibits those violations and reduces to the standard model (2) that the rules of quantum probability actually do follow from the existence of microscopic time loops. Hadley has an argument that the nondistributive logic of quantum propositions also characterizes the nonlocal constraints arising from time loops, and that this in turn implies the rest of the quantum formalism (e.g. the use of Hilbert space and noncommutative operators for observables); but I believe he needs to actually exhibit some simple solutions to general relativity containing time loops, and show how to obtain the Schrodinger equation from the application of probability theory to such a class of simple solutions, before his argument can be taken seriously.
If this approach worked, it would be very elegant, because it would turn out that quantum mechanics is a microscopic side effect of general relativity. It would require the matter fields to exhibit microscopic violations of the energy conditions which normally prevent wormholes and time machines, but this is not impossible, there are many simple models in which the energy conditions are violated.
Energy conditions (well, the topological censorship, really) in classical GR prevent only traversable wormholes, and only in 3+1 dimensions. Non-simply connected spacetimes are otherwise allowed in a covariant formulation of GR, though they do not arise in an initial value problem with a simply connected spacelike initial surface.
Additionally, changing one’s past is absolutely incompatible with GR, as there is a unique metric tensor associated with each spacetime point, not two or more different ones, one for each go through a closed timelike curve. The only way time travel can happen in GR is by unwrapping these time loops into some universal cover. And there is a heavy price to pay for that, but that discussion is straying too far afield, so feel free to PM me if you want to talk further.
By the way, you’re doing an excellent job of explanation, but I hope you see by now what I meant by “playing Whack-A-Mole”. Every time you make a point, rather than acknowledge it, he’ll just restate his vague objection in more elevated jargon.
2) Why does |U1> have a probability of .36 and why does |U0> have a probability of .64?
Because the wavefunction is, first and foremost, an object in a Hilbert space satisfying an L^2 conservation law, so the only legitimate way to define its “size” or “degree of reality” is the L^2 norm.
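(In symbols, the conservation law being appealed to, for a Hermitian Hamiltonian:)

```latex
i\,\partial_t \lvert\psi\rangle = H \lvert\psi\rangle
\;\Longrightarrow\;
\frac{d}{dt}\langle \psi \mid \psi \rangle
  = i\,\langle \psi \mid (H^\dagger - H) \mid \psi \rangle = 0,
\qquad
\langle \psi \mid \psi \rangle = \sum_i \lvert c_i \rvert^2 .
```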
the only legitimate way to define its “size” or “degree of reality” is the L^2 norm
“Degree of reality”—an interesting concept, especially when employed as an explanation of why some things happen more often than others. Why does this coin come up heads twice as often as it comes up tails? Because coming up heads has twice the “degree of reality” of coming up tails. Funny, they both felt equally real when they happened…
Face it: if you are going to assert that the observed frequencies of physical events are explained by the existence of Many Worlds, then the frequencies with which those events occur throughout the Many Worlds have to match the observed frequencies. You are going to have to say that the L^2 norm tells you how many copies of a branch exist, not just that a branch has a “size” or a “degree of reality”.
It would be nice if the universe were finite, but you can’t demand that a priori; it’s enough that the infinite mathematical object obeys simple rules.
I’m saying that if we lived in another universe, and someone came along and described to us the wavefunction for the Schrodinger equation, and asked how we should regard the size of some part of the configuration space compared to some other part, the L^2 norm is the blindingly obvious mathematical answer because of the properties of the wavefunction. And so if we (outside the system) were looking for a “typical” instance of a configuration corresponding to a mind, we would weight the configurations by the L^2 norm of the wavefunction.
Because (as it turns out) the wavefunction has a distinguished exceptionally-low-entropy state corresponding to the Big Bang, the configurations where the wavefunction is relatively large encode in various ways the details of (practically unique) intermediate stages between the Big Bang state and the one under consideration: that is, they encode unique histories (1). So a “typical” instance of a configuration containing a mind turns out to be one that places it within a context of a unique and lawful history satisfying the Born probabilities, because the L^2 norm of the wavefunction over the set where they hold to within epsilon is much, much larger than the L^2 norm of the rest. So to the extent that I’m a typical instance of mind-configurations similar to me, I should expect to remember and see evidence of histories satisfying the Born probabilities.
...seriously, I don’t see why people get worked up over this. OK, Eliezer has his infinite-set atheism, and you have your insistence on a naive theory of qualia, but what about everyone else?
(1) This is not a conjecture, it is not controversial, it is something you can prove mathematically about the Schrodinger equation in various contexts.
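(A toy check of that footnote for the simplest case, repeated identical two-outcome measurements with Born weight p = 0.64; this is just the binomial law of large numbers applied to the branch measure, a sketch rather than a derivation.)

```python
from math import comb

p, N, eps = 0.64, 500, 0.05

# total L^2 weight (Born measure) of the branches whose recorded frequency
# of the p-outcome lies within eps of p
weight = sum(comb(N, k) * p**k * (1 - p)**(N - k)
             for k in range(N + 1)
             if abs(k / N - p) <= eps)

print(weight)   # roughly 0.98 here, and -> 1 as N grows:
                # almost all of the L^2 weight sits on Born-looking branches
```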
how we should regard the size of some part of the configuration space compared to some other part, the L^2 norm is the blindingly obvious mathematical answer because of the properties of the wavefunction.
Does “part of the configuration space” refer to a single state vector, or a whole region that a state vector might belong to? My impression is that measuring the latter sort of thing is problematic from a rigorous mathematical standpoint. Is this correct, and does it have consequences for your discussion?
I say the former; people scared of continuous densities might prefer the latter, at which point they have the traditional sorites paradox of how large an epsilon-neighborhood to draw; but in practical terms, this isn’t so bad because (if we start with low entropy) decoherence rapidly separates the wavefunction into thin wisps with almost-zero values taken between them.
Okay, I have tried to understand what sort of ontology could answer to your description. A key consideration: you say we should judge “the size of some part of the configuration space compared to some other part” according to “the L^2 norm of the wavefunction”. You also talk about “mind-configurations similar to me”.
A wavefunction may evolve over time, but configuration space does not. Configuration space is a static arena, and the amplitudes associated with configurations change (unless we’re talking about a timeless wavefunction of the universe; I’ll come to that later). In general, I infer from your discussion that configurations are real—they are the worlds or branches—and the wavefunction determines a measure on configuration space. The measure can’t be identified with the wavefunction—the phase information is lost—so, if we are to treat the wavefunction as also real, we seem to have a dualism remotely similar to Bohmian mechanics: The wavefunction is real, and evolves over time, and there is also a population of configurations—the worlds—whose relative multiplicity changes according to the changing measure.
I want to note one of the peculiarities of this perspective. Since configuration space does not change, and since the different configurations are the worlds, then at every moment in the history of the universe, every possible configuration exists (presumably except for those isolated configurations which have an individual measure of exactly zero). What distinguishes one moment from the next is that there is “more” or “less” of each individual configuration. If we take the use of the mathematical continuum seriously, then it seems that there must be an uncountable number of copies of each configuration at each moment, and the measure is telling us the relative sizes of these uncountable sets.
This scenario might be simplified a little if you had a timeless wavefunction of the universe, if the basic configurations were combinatorial (discrete degrees of freedom rather than continuous), and if amplitudes / probabilities were rational numbers. This would allow your multiverse to consist of a countable number of configurations, each duplicated only finitely often, and without the peculiar phenomenon of all configurations having duplicates at every moment in the history of the universe. This would then land us in a version of Julian Barbour’s Platonia.
There are three features of this analysis that I would emphasize. First, relativity in any space-time sense has disappeared. The worlds are strictly spatial configurations. Second, configurations must be duplicated (whether only finitely often, or uncountably infinitely often), in order for the Born frequencies to be realized. Otherwise, it’s like the parable of the car dealer. Just associating a number with a configuration does not by itself make the events in that configuration occur more frequently. Third, the configurations are distinct from the wavefunction. The wavefunction contains information not contained in the measure, namely the phase relations. So we have a Bohm-like dualism, except, instead of histories guided by a pilot wave, we have disconnected universe-moments whose multiplicities are determined by the Born rule.
There are various ways you could adjust the details of this ontology—which, I emphasize, is an attempt to spell out the ontological commitments implied by what you said. For example, your argument hinged on typicality—being a typical mind-configuration. So maybe, instead of saying that configurations are duplicated, you could simply say that configurations only get to exist if their amplitude is above some nonzero threshold, and then you could argue that Born frequencies are realized inside the individual universe-configuration. This would be a version of Everett’s original idea, I believe. I thought it had largely been abandoned by modern Many Worlds advocates—for example, Robin Hanson dismisses it on the way to introducing his idea of mangled worlds—but I would need to refresh my knowledge of the counterarguments to personally dismiss it.
In any case, you may wish to comment on (1) my assertion that this approach requires dualism of wavefunction and worlds (because the wavefunction can’t be identified with the ensemble of worlds, on account of containing phase information), (2) my assertion that this approach requires world duplication (in order to get the frequencies right), and (3) the way that configuration has supplied a definitely preferred basis in my account. Most Many Worlds people like to avoid a preferred basis, but I don’t see how you can identify the world we actually experience with a wavefunction-part unless you explicitly say that yes, that wavefunction-part has a special status compared to other possible local basis-decompositions. Alternatively, you could assert that several or even all possible basis-decompositions define a “valid” set of worlds, but validity here has to mean existing—so along with the ensemble of spatial configurations, distinct from the wavefunction, you will end up with other ensembles of worlds, corresponding to the basis wavefunctions in other choices of basis, which will also have to be duplicated, etc., in order to produce the right frequencies.
To sum up, my position is that if you do try to deliver on the claims regarding how Many Worlds works, you have to throw out relativity as anything more than a phenomenological fact; you have to have duplication of worlds in order to get the Born frequencies; and the resulting set of worlds can’t be identified with the wavefunction itself, so you end up with a Bohm-like dualism.
A wavefunction may evolve over time, but configuration space does not.
This is probably not true. To really get off the ground with quantum field theory, you have to attach an a priori different Hilbert space of states to each space-like slice of spacetime, and make sense of what equations of motion could mean in this setting—at least this is my limited understanding. I haven’t been following your discussion and I don’t know how it affects the MWI.
you have to attach an a priori different Hilbert space of states to each space-like slice of spacetime
That is a valid formalism but then all the Hilbert spaces are copies of the same Hilbert space, and in the configuration basis, the state vectors are still wavefunctionals over an identical configuration space. The only difference is that the configurations are defined on a different hypersurface; but the field configurations are otherwise the same.
ETA: This comment currently has one downvote and no follow-up, which doesn’t tell me what I got “wrong”. But I will use the occasion to add some more detail.
In perturbative quantum field theory, one of the basic textbook approaches to calculation is the “interaction picture”, a combination of the Schrodinger picture in which the state vector evolves with time and the operators do not, and the Heisenberg picture, in which the state vector is static and the operators evolve with time. In Veltman’s excellent book Diagrammatica, I seem to recall a discussion of the interaction picture, in which it was motivated in terms of different Hilbert spaces. But I could be wrong, he may just have been talking about the S-matrix in general, and I no longer have the book.
The Hilbert spaces of quantum field theory usually only have a nominal existence anyway, because of renormalization. The divergences mean that they are ill-defined; what is well-defined is the renormalization procedure, which is really a calculus of infinite formal series. It is presumed that truly fundamental, “ultraviolet-complete” theories should have properly defined Hilbert spaces, and that the renormalizable field theories are well-defined truncations of unspecified UV-complete theories. But the result is that practical QFT sees a lot of abuse of formalism when judged by mathematical standards.
So it’s impossible to guess whether Sewing-Machine is talking about a way that Hilbert spaces are used in a particular justification of a practical QFT formalism; or perhaps it is the way things are done in one of the attempts to define a mathematically rigorous approach to QFT, such as “algebraic QFT”. These rigorous approaches usually have little to say about the field theories actually used in particle physics, because the latter are renormalizable gauge theories and only exist at that “procedural” level of calculation.
But it should in any case be obvious that the configuration space of field theory in flat space is the same on parallel hypersurfaces. It’s the same fields, the same geometry, the same superposition principle. Anyone who objects is invited to provide a counterargument.
Mathematicians work with toy models of quantum field theories (e.g. topological QFTs) whose purpose would be entirely defeated if all slices had the same fields on them. For instance, the topology of a slice can change, and mathematicians get pretty excited by theories that can measure such changes. Talking about flat spacetime I suppose such topology changes are irrelevant, and you’re saying moreover that there are no subtler geometric changes that are relevant at all? What if I choose two non-parallel slices?
Quantum field theory is very flexible and can take many forms. In particle physics one mostly cares about quantum fields in flat space—the effects of spatial curvature are nonexistent e.g. in particle collider physics—and this is really the paradigmatic form of QFT as far as a physicist is concerned. There is a lot that can be done with QFT in curved space, but ultimately that takes us towards the fathomless complexities of quantum gravity. I expect that the final answer to the meaning of quantum mechanics lies there, so it is not a topic one can avoid in the long run. But I do not think that adding gravity to the mix simplifies MWI’s problem with relativity, unless you take Julian Barbour’s option and decide to prefer the position basis on the grounds that there is no time evolution in quantum gravity. That is an eccentric combination of views and I think it is a side road to nowhere, on the long journey to the truth. Meanwhile, in the short term, considering the nature of quantum field theory in Minkowski space has the value that it shows up a common deficiency in Many Worlds thinking.
Non-parallel slices… In general, we are talking about time evolution here. For example, we may consider a wavefunction on an initial hypersurface, and another wavefunction on a final hypersurface, and ask what is the amplitude to go from one to the other. In a Schrodinger picture, you might obtain this amplitude by evolving the initial wavefunction forward in time to the final hypersurface, and then taking the inner product with the final wavefunction, which tells you “how much” (as orthonormal might put it) of the time-evolved wavefunction consists of the desired final wavefunction; what the overlap is.
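(Schematically, in standard Schrodinger-picture notation for the parallel-slice case:)

```latex
A = \langle \Psi_{\mathrm{final}} \mid U(t_f, t_i) \mid \Psi_{\mathrm{initial}} \rangle,
\qquad
U(t_f, t_i) = T \exp\!\left( -\frac{i}{\hbar} \int_{t_i}^{t_f} H(t)\, dt \right),
% |A|^2 measuring "how much" of the time-evolved state overlaps the target.
```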
Non-parallel spacelike hypersurfaces will intersect somewhere, but you could still try to perform a similar calculation. The first difficulty is, how do you extrapolate from the ‘initial’ to the ‘final’ hypersurface? Ordinary time evolution won’t do, because the causal order (which hypersurface comes first) will be different on different sides of the plane of intersection. If I was trying to do this I would resort to path integrals: develop a Green’s function or other propagator-like expression which provides an amplitude for a transition from one exact field configuration on one hypersurface, to a different exact field configuration on the other hypersurface, then express the initial and final wavefunctions in the configuration basis, and integrate the configuration-to-configuration transition amplitudes accordingly. One thing you might notice is that the amplitude for configuration-to-configuration transition, when we talk about configurations on intersecting hypersurfaces, ought to be zero unless the configurations exactly match on the plane of intersection.
It’s sort of an interesting problem mathematically, but it doesn’t seem too relevant to physics. What might be relevant is if you were dealing with finite (bounded) hypersurfaces—so there was no intersection, as would be inevitable in flat space if they were continued to infinity. Instead, you’re just dealing with finite patches of space-time, which have a different spacelike ‘tilt’. Again, the path integral formalism has to be the right way to do it from first principles. It’s really more general than anything involving wavefunctions.
It’s sort of an interesting problem mathematically, but it doesn’t seem too relevant to physics
I disagree, certainly this is where all of the fun stuff happens in classical relativity.
Anyway, I guess I buy your explanation that time evolution identifies the state spaces of two parallel hypersurfaces, but my quarter-educated hunch is that you’ll find it’s not so in general.
I suspect the more fundamental difference of perspective here is about metaphysics. I feel like I can always fall back on “it doesn’t really matter, I’m only talking in terms of physics because talking in terms of simulations causes people to go funny in the head”, but my impression is that you’re skeptical of such naive computationalism? (I don’t think the hard problem has been at all solved and I have a real appreciation for the difference between syntax and semantics—I’m something of a property dualist? But I still don’t understand what may or may not be your opposing intuitions. (I’m sort of suspicious of SL4 folk I guess, I lump folk like Eliezer and a few others into the “never acclimated to SL5 and got left behind” crowd but only very tentatively.))
(But philosophical natter won’t help with making actual progress, won’t get you anywhere. Having concluded that the physical world and the domain of decision theory are fundamentally mathematical, the next step is to master what people know about mathematical thinking, and perhaps physics. Fluency in commonly useful mental tools, just short of becoming specialized in anything in particular, in order to complete this stage in reasonable time, like 10 years.)
Sans les mathématiques on ne pénètre point au fond de la philosophie. Sans la philosophie on ne pénètre point au fond des mathématiques. Sans les deux on ne pénètre au fond de rien. — Leibniz [Without mathematics we cannot penetrate deeply into philosophy. Without philosophy we cannot penetrate deeply into mathematics. Without both we cannot penetrate deeply into anything.]
(Perhaps we should taboo “philosophical”. Speculative technical discussion often leads to actual progress. I don’t yet believe in math, but I know that I need to hang out a lot with people who do believe in math if I’m to stay on track, and that’s what I do. (Though not enough.))
Let’s just ignore questions of consciousness entirely and think in terms of decision-making systems, which may or may not be conscious, and which have sensory inputs, some self-knowledge or introspective capacity, a capacity to make causal world-models, etc (all those things can be given a purely functionalist definition).
What then does “talking in terms of simulations” mean? It means that the decision-making system needs to consider, in choosing a world-model, worlds where it (the decision-making system) exists at the physics level—at the lowest possible level of implementation, in a given ontology—and worlds where it exists at a level somewhere above lowest possible—that is, in a simulated physics several layers of abstraction removed from a fundamental physics.
I strongly doubt that you’re going to be able to derive the Born rule by just thinking about a decision theory that worries about whether you’re an nth-level simulation, and doesn’t concern itself too much with the nature of physics at the bottom level. Back on Earth, we didn’t derive the Born rule from any sort of apriori, it was chosen solely on the basis of empirical adequacy. But if you are going to derive it by reasoning about your possible place in an apriori multiverse (think Tegmark level 4), then you simply have to concern yourself with the distribution of possible bottom-level physical ontologies. Even if it turns out that simulations, and simulations of simulations, are frequent enough in the multiverse that you must give those possibilities significant consideration, I don’t see how you can get to that stage without going through the stage of thinking about bottom-level physical ontologies.
Agreed, you need something like a basement to get a baseline, at the very least a logical basement as a Schelling point. There’s not a non-circular obvious decision theoretic reason why you or why cosmological natural selection would ‘pick’ the squared modulus as a Schelling point. But it’s sort of like property rights; we emerged out of Hobbesian anarchy somehow, and that somehow can be at least partially “explained” with game theory, social psychology, or ecology. Ultimately those all feed into each other, but I wouldn’t consider it fruitless to choose one approach and see how far it takes you. Does this analogy fail in the case of deriving the Born rule?
Deviations from the Born rule should be derived from timelessness-cognizant game theory for multipartite systems in order to find equilibria from first principles.
That would make more sense to me, though I still don’t believe that timeless equilibria have much to do with anything. The relationship between simulatee and simulator is completely asymmetric; the simulatee is at the mercy of the simulator in the Vast majority of cases.
As for the origin of the Born rule itself, I certainly don’t believe it has an origin in terms of multiverse-appropriate decision theory. Quantum mechanics is incomplete, it’s a type of statistical mechanics that arises from some class of more fundamental theory that we haven’t yet identified, and the Born rule—that is, the feature that probabilities come from the product of a complex number with its complex conjugate—specifically results from features of that more fundamental theory; that’s how I think it works.
But doesn’t statistical mechanics also fall out of decision theory? Or are you saying that perspective is not a useful one in that it doesn’t explain the arrow of time? (I’m really tired right now, I apologize if I’m only half-responding to the things you’re actually saying.)
Yup. Bayesian agents aren’t good at thinking about themselves, and if you can’t think about yourself you’re in trouble when someone starts offering you bets. I feel like there must be a way in which the whole thing is ironic in a philosophically deep way but I can’t quite put my finger on it.
Basically there is an ontology that reifies decision theory as fundamental and reasons about everything in terms of it. It’s a powerful ontology, and often it is a beautiful ontology. Even better, it’s still inchoate, and so it’s not yet as beautiful as it someday will be.
Probably a little insane, but there is prior work. Saying “hey, let’s do this but with ambient decision theory” isn’t much of a leap.
I have a more insane idea about under what specific conditions the Born rule isn’t an accurate approximation for decision theoretic purposes (which is sort of like a hypothesis about one possible convergent universal instrumental value (a convergent way superintelligence-instantiations with near-arbitrary initial goal systems to collectively optimize the universe (or a decision policy attractor that superintelligence-instantiations will predictably fall into))), but the margin is too small to explain it and it builds on this other INCREDIBLY AWESOME idea of Steve Rayhawk’s and I don’t want to steal his thunder. (I previously hinted at it as a way to revive the dead even when there’s no information about them left in your light cone and not just with stupid tricks like running through all possible programs. Sounds impossible right? Bwa ha ha. [Edit: Actually I’m not sure if it’s technically still in your light cone or not. I’d have to think. I don’t like thinking.]) Hey Steve, would you mind briefly explaining the reversible computing idea here? Pretty please so I don’t have to keep annoying LW by being all seekrit?
I am going to have to do a proper LW sequence demonstrating why Many Worlds is of little or no interest as a serious theory of physics. Reduced to a slogan: Either you specify what parts of the wavefunction correspond to observable reality, and then you fail to comply with relativity and the Born rule, or else you don’t specify what parts of the wavefunction correspond to observable reality, and then you fail to have a theory. The Deutsch-Wallace approach of obtaining the Born rule from decision theory rather than from actual frequencies of events in the multiverse IMHO is just a hopeless attempt to get around this dilemma, by redefining probability so it’s not about event frequencies.
What are your thoughts on the alternatives? My frustration with the subject as it is discussed on LW is that everyone only speaks about Many Worlds in contrast to Copenhagen. It’s a bit like watching a man beat up a little girl and concluding he is the strongest man in the world. I wish people here would pay more attention to the live alternatives. I find De Broglie-Bohm particularly intriguing. I hope you write the sequence.
These are my rather circumspect thoughts on where the answer will come from.
Bohmian mechanics is ontologically incompatible with special relativity. You still get relativistic effects, but the theory requires a notion of absolute simultaneity in order to be written down, so there is an undetectable preferred reference frame. Still, I have at least two technical reasons (weak values and holographic renormalization group) for thinking it is close to something important, so it’s certainly in my thoughts (when I think about this topic).
Thoughts on this?
I’ll admit that emergent Lorentz invariance is not a completely unreasonable idea; there are many other examples of emergent symmetry. Though I wonder how it looks when you try to obtain emergent diffeomorphism invariance (for general relativity) as well.
Goldstein and Tumulka looks a little artificial. They allow for entanglement but not interaction, and thereby avoid causal loops in time. I’m far more interested in attempts to derive quantum nonlocality from time loops. Possibly their model can be obtained from a time-loop model in the limit where interactions are negligible.
I agree with Jack, I’d appreciate a sequence, especially one that touched on interpretations that do weird things with timelike curves.
If you spent as much energy actually trying to understand what MWI actually means as you do trying to argue against it, we wouldn’t have to play Karma Whack-A-Mole every time you start spouting nonsense on here.
Very well; I ask you to exhibit one, just one, example of a wavefunction in which you can (1) specify which parts are the “worlds” / “branches” / whatever-you-call-them (2) explain how it is that those parts are relativistically covariant (3) explain why it is that one branch can have an unequal probability compared to another branch, when they are both equally real.
Let’s call |1> and |0> the two energy states of a two-state system. We can then imagine the system being in the state (3/5) |1> + (4/5) |0>. If the state of the universe except for this system is called |U>, the two “worlds” (or as I like to call them, “dimensions in Hilbert space”) are then |U> x |1> = |U1> and |U> x |0> = |U0>.
Using this example, could you more specifically state your problems?
1) The worlds have been defined with respect to a particular reference frame. What happens to them under a Lorentz boost?
2) Why does |U1> have a probability of .36 and why does |U2> have a probability of .64? In the very small multiverse you have described, they each exist once, so it seems like they should each have a probability of .5.
1) Under a Lorentz boost, the energy levels become closer together to obey time dilation. Not seein’ the problem here.
2) Because we’ve observed it in experiment. Sure, there’s lots of justifications within the framework of quantum mechanics (In collapse speak, if we measure the particle, the probability we collapse to eigeinfunctions is the square of their amplitude. In MW-speak, when we get entangled with the particle, what we’ll see form the inside is a mixed state in the eigenstate-basis with probabilities equal to the square of the amplitudes.), but that framework exists because it’s what matches experiment. A coin weighted on the heads side only has two possibilities too, but it doesn’t have a .5 chance of landing heads or tails.
Aren’t your quantum states defined only on the hypersurfaces of a particular foliation of space-time? That’s the problem. By reifying these states, you also have to reify the hypersurfaces they are defined on.
But you haven’t explained how it is that your theory predicts what we observe. You’ve said there are two worlds, “3/5 |U> x |1>” and “4/5 |U> x |0>”. Two worlds, they both exist, one contains |1>, the other contains |0>, seems like |1> and |0> should be equally probable. Instead we observe them with unequal frequency.
Ah, you meant what’s the effect on entangled particles at different locations? I still don’t see that there’s a problem. You just see a different slice of Hilbert space, and Hilbert space is what gets realified (new word) by MW. In fact, I’d say it handles relativity better than a way of thinking that involves lots of collapses—if we’re a light-year apart and each measure an independent particle at the same coordinate time, an objective collapse violates the either the Copernican principle or relativity—we can’t have independent objective collapses.
If you want an explanation of how you get a probabilistic state from an entangled state (“how the theory predicts what we observe”), check out partial traces.
No, I just meant that your Hilbert space is associated with a preferred foliation. The states in the Hilbert space are superpositions of configurations on the slices of that foliation. If you follow Copenhagen, observables are real, wavefunctions are not, and this foliation-dependence of the wavefunctions doesn’t matter. It’s like fixing a gauge, doing your calculation, and then getting gauge-invariant results back for the observables. These results—expectation values, correlation functions… - don’t require any preferred foliation for their definition. The wavefunctions do, but they are just regarded as constructs.
So Copenhagen gets to be consistent with special relativity at the price of being incomplete. Now according to Many Worlds, we can obtain a complete description of physical reality by saying that wavefunctions are real. What I am pointing out is that wavefunctions are defined with respect to a reference frame. Time is not an operator and you need surfaces of simultaneity for Schrodinger evolution. The surface of simultaneity that it lives on is one of the necessary ingredients for defining a wavefunction. If the wavefunction is real, then so is the surface of simultaneity, but the whole point of special relativity is that there is no absolute simultaneity. So how do you, a wavefunction realist, get around this?
So, let’s return to your example. The wavefunction of the universe is “|U> x ( 3⁄5 |1> + 4⁄5 |0> )”. Well, this isn’t a great example because the wavefunction factorizes. But anyway, let’s suppose that the reduced density matrix of your two-state system is c_00 |0><1|. You still need to explain how the Born rule makes sense in terms of a multiverse.
Perhaps an analogy will make this clearer. Suppose I’m a car dealer, and you place an order with me for 9 BMWs and 16 Rolls-Royces. Then you come to collect your order, and what you find is one BMW with a “3” painted on it, and one Rolls-Royce with a “4″ painted on it. You complain that I haven’t filled the order, and I say, just square the number painted on each car, and you’ll get what you want. So far as I can see, that’s how MWI works. You work with the same wavefunctions that Copenhagen uses, but you want to do without the Born rule. So instead, you pull out a reduced density matrix, point at the coefficients, and say “you can get your probabilities from those”.
That’s not good enough. If quantum mechanics is to be explained by Many Worlds, I need to get the Born rule frequencies of events from the frequencies with which those events occur in the multiverse. Otherwise I’m just painting a number on a state vector and saying “square it”. If you don’t have some way to decompose that density matrix into parts, so that I actually have 9 instances of |1> and 16 instances of |0>, or some other way to obtain Born frequencies by counting branches, then how can you say that Many Worlds makes the right predictions?
Once you get into field theory you have x, y, z and t all treated as coordinates, not operators. The universe realio trulio starts to look like a 4-dimensional object, and reference frames are just slices of this 4-dimensional object. And I guess you’re right, if you don’t use relativistic quantum mechanics, you won’t have all the nice relativistic properties.
If you want your probabilities to be frequencies, I suppose you could work out the results if you wanted. The run-of-identical-experiment frequencies should actually be pretty easy to calculate, and will give the same answer whether or not you collapse, for obvious mathematical equivalence reasons. And if that’s good enough for you to accept that the outputs of ordinary quantum mechanics “really are” probabilities, maybe it will be good enough for slightly less ordinary quantum mechanics.
A better exercise to explore the unique probabilities in MW might be to show that, if our observer gets totally entangled with a series of two-state systems, the probabilities given by the partial density matrix evolve according to the rules you’d expect from collapse. Note that this isn’t just another boring mathematical equivalence. Humans are interactions between a bunch of multi-state systems. If we evolve in a way that looks like collapse, we’ll see something that looks like collapse!
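A stripped-down version of that exercise, with my own toy numbers and with “record qubits” standing in for the observer (so this is a sketch of the idea, not a derivation): after the measured qubit gets copied into a few records, tracing the records out leaves a diagonal reduced density matrix carrying the Born weights, which is exactly what a collapse rule would have written down.

```python
import numpy as np

a, b = 0.6, 0.8                        # amplitudes of the measured qubit: a|0> + b|1>
n_records = 3                          # record qubits standing in for the observer

# Post-measurement entangled state a|0>|00...0> + b|1>|11...1>:
# each CNOT copies the qubit's basis value into one more record qubit.
zeros = np.zeros(2 ** n_records); zeros[0] = 1.0      # records all 0
ones = np.zeros(2 ** n_records); ones[-1] = 1.0       # records all 1
ket0 = np.array([1.0, 0.0]); ket1 = np.array([0.0, 1.0])
psi = a * np.kron(ket0, zeros) + b * np.kron(ket1, ones)

rho = np.outer(psi, psi.conj())
d = 2 ** n_records
rho_qubit = rho.reshape(2, d, 2, d).trace(axis1=1, axis2=3)   # trace out the records

print(np.round(rho_qubit, 3))
# [[0.36 0.  ]
#  [0.   0.64]]   <- diagonal with the Born weights; the interference terms are gone,
#                    just as a collapse rule would predict
```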
But the quantum wavefunction isn’t a four-dimensional object. If we’re doing field theory, it’s an object in an infinite-dimensional space. The four-dimensionality of field theory resides in the operators, not the wavefunctions. So if I say that the observables corresponding to operators are what’s real, I can think relativistically about space and time, because everything that’s real is always anchored to a specific point in space-time, and the notion of a point doesn’t involve absolute simultaneity. But if I say that the wavefunctions are real, then I have to say that the spacelike hypersurfaces with which they are associated are also real.
What else can they be, in a Many Worlds theory? The whole meaning of Many Worlds is that this is one world among many. There are other worlds and things happen differently there. So if we do the math and add up the frequencies for physical events across all the worlds, we had better find out that ours is a typical sort of world.
Unfortunately, a lot of people who talk about Many Worlds never even think things through this far. They just think “unitary evolution produces decoherence, decoherence diagonalizes a particular basis, observable reality is one of those basis states, therefore the wavefunction of the universe contains observable reality and I don’t need to say any more”. In particular, Many Worlds advocates tend to surreptitiously rely on the Born rule in order to explain the observed frequencies of events. Without something like a Born rule, a partial density matrix is just a mathematical object. If you inspect it, you will not see multiple copies of anything. Instead, you will see an array of numbers. It’s just like the parable of the car dealer. If I am to deliver on your order for 9 BMWs, I have to hand over nine cars, not one car with a number painted on it. Many Worlds fails to deliver on its promise for exactly the same reason.
I don’t see why any of this is true. You’ll have to unpack more and make it easier to understand, maybe.
Whoa whoa whoa. No. You should not be putting in this much effort if you don’t agree that the “worlds” are a convenient but misleading way to describe it.
OK:
What is a four-dimensional object? It’s an object which lives in four dimensions. What does that mean? It means its parts can be located in four-dimensional space. If that’s Minkowski space, then we can look at the object from the perspective of various relativity-compliant reference frames.
Now what is a wavefunction? It can only be regarded as four-dimensional in this sense if it’s the wavefunction of a single particle. Once you talk about wavefunctions for multiple particles, or wavefunctionals for quantum fields, they don’t have localizable parts. Their constituent amplitudes are at best “multilocal”, e.g. you have amplitudes for a set of n mutually spacelike points.
The field operators are indexed by space-time coordinates—they have the form psi(x), where psi is the field, x is a spatial or a space-time position, and psi(x) is an operator which can be applied to a wavefunctional of amplitudes for field configurations. So the operators for a field are four-dimensional (in four-dimensional quantum field theory) because there is a four-dimensional manifold of them. This is because the space-time value of the field is the corresponding observable and the field potentially has a value anywhere in four-dimensional space-time.
Wavefunctions, and this is especially clear for multiparticle configurations and for fields, are superpositions of configurations defined on some spacelike hypersurface. The hypersurface is part of the definition of the wavefunction, one of the conceptually essential ingredients. So if the wavefunction is real, so is the hypersurface on which it is defined.
I refer to my dilemma for Many Worlds advocates, as quoted by orthonormal. If you cannot say what the worlds are, then you don’t have a theory. You may think you have a theory, but you don’t, because the worlds (branches, configurations, something) are supposed to be the point of contact between the actually-real wavefunction and observable reality.
One reason I am so strident on this topic is that belief in Many Worlds often seems to be based on half-examined notions that don’t even make sense when you manage to force them into words. The typical non-physicist’s idea of Many Worlds is that it involves many worlds, parallel universes just like in science fiction. The typical physicist’s idea of Many Worlds is more like “wavefunction collapse does not occur”; it’s a “no-collapse” interpretation. But this is the interpretation which is nonsense unless you force it into the mold of the “naive” Many Worlds interpretation, at which point it becomes susceptible to critique and falsification.
The no-collapse interpretation survives in physicists’ minds because of two things: first, Copenhagen tells us that we can get empirically accurate results from wavefunctions; second, doublethink about the meaning of decoherence. No-collapse advocates want to say that decoherence explains how to see observed reality, deep inside the wavefunction, but they won’t take this approach seriously enough to say that the components, aspects, or substructures of the wavefunction that they are pointing to, are really real—because that would be like having a preferred basis. This attitude insulates them from having to address the problems with relativity and the Born rule that people who do talk about worlds face. That’s why I call it doublethink.
If reality is to be found inside the wavefunction by decomposing the local density matrix in the most diagonal basis, then you’re saying that observable reality is one of those local basis states, and you are accountable for explaining why the square of its coefficient corresponds to the frequency with which the corresponding events are actually seen to happen.
Hm, stuff does seem to be more complicated than I’d thought.
Still, saying “and so, collapse happens” doesn’t sit well with me at all, for much-better-understood nonrelativistic QM reasons. Say we’re on opposite sides of a reasonably information-proof wall, and I measure a 2-state system. This is an identical problem to Schrodinger’s cat measuring the decay of an atom—I go into a macroscopic superposition. If you had a 2-state system that was entangled with my 2-state system, you could do a Bell inequality measurement on a signal that I send—even if I send the message manually—and it would show that I really am in this quantum state. On the other hand, from my perspective, when I measure a system I get an answer. So from your perspective I’m in an entangled state, and from my perspective I’ve measured a definite value. How would collapse replicate this sort of subjectivity?
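For what it’s worth, here is the kind of calculation that Bell-test claim rests on. The maximally entangled state and the measurement angles are the standard textbook choices; the toy code itself is mine. The CHSH value exceeds the classical bound of 2, which is what would let the outside party certify that I am still in an entangled state rather than having collapsed to a definite outcome.

```python
import numpy as np

def obs(theta):
    # Spin measurement along angle theta in the x-z plane: cos(theta) Z + sin(theta) X
    Z = np.array([[1, 0], [0, -1]], dtype=float)
    X = np.array([[0, 1], [1, 0]], dtype=float)
    return np.cos(theta) * Z + np.sin(theta) * X

# Maximally entangled state (|00> + |11>) / sqrt(2)
phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)

def corr(A, B):
    return phi @ np.kron(A, B) @ phi    # <phi| A x B |phi>

A0, A1 = obs(0), obs(np.pi / 2)
B0, B1 = obs(np.pi / 4), obs(-np.pi / 4)

chsh = corr(A0, B0) + corr(A0, B1) + corr(A1, B0) - corr(A1, B1)
print(chsh)    # ~2.828 = 2*sqrt(2), above the classical bound of 2
```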
Yet another reason why Copenhagen interpretation, in its true form, does not reify the wavefunction. “Collapse” is just like the update of a prior in the light of new knowledge; you throw away the parts of a probability distribution which are now knowably not relevant. According to Copenhagen, it is the observables that are real, and the wavefunctions are just tabulations of incomplete knowledge. The Copenhagen interpretation only leads you astray if you try to defend the idea that QM according to Copenhagen is a complete theory. But if you’re happy with the idea that QM is incomplete (and thus not the final word in physics), then Copenhagen is your guide. The problem of collapsing wavefunctions is entirely an artefact of belief in wavefunctions. The real problem is simply to explain what’s behind the success of QM, and wavefunction realism is just one possible approach.
Okay. So how would your favored interpretation handle that sort of subjectivity?
It is not my favorite, but an approach which should at least be easy to understand is the “zigzag in time” interpretation, which says that spacelike correlations are due to microscopic time loops. Physics is local, but there are inflection points where forward-in-time causality turns into backwards-in-time causality, and the actual causal web of the universe therefore involves nonlocal-looking regularities. On this view, quantum mechanics is the statistical mechanics of a physics with causal chains running forward and backward in time, and such a physics becomes possible with general relativity.
The first part of this idea—causes operating in both directions of time—is almost as old as quantum mechanics. It’s in the Wheeler-Feynman absorber theory, the transactional interpretation of John Cramer, Yakir Aharonov’s time-symmetric quantum mechanics, and the work of Huw Price, among others; but I prefer the relatively obscure work of Mark Hadley, because he gives it the clearest foundation: the “inflection” in which the time direction of a causal chain reverses, as arising from a non-time-orientable patch in the space-time 4-manifold.
If the microscopic topology of space-time admits such regions, then not only is its evolution in time non-deterministic, but it will be non-deterministic in a complexly correlated way: causal loops in the far future topology constrain what happens on a spacelike hypersurface in the present, in a way that looks highly nonlocal. One manifestation of this would be nonlocally correlated perturbations to the passage of a particle or a wave through space, perturbations correlated not just with each other but also with distant distributions of matter; thus, the effects seen in the double-slit experiment, and all the other standard quantum phenomena.
If this approach worked, it would be very elegant, because it would turn out that quantum mechanics is a microscopic side effect of general relativity. It would require the matter fields to exhibit microscopic violations of the energy conditions which normally prevent wormholes and time machines, but this is not impossible; there are many simple models in which the energy conditions are violated. The challenge would be to show (1) a combination of fields which exhibits those violations and reduces to the standard model, and (2) that the rules of quantum probability actually do follow from the existence of microscopic time loops. Hadley has an argument that the nondistributive logic of quantum propositions also characterizes the nonlocal constraints arising from time loops, and that this in turn implies the rest of the quantum formalism (e.g. the use of Hilbert space and noncommutative operators for observables); but I believe he needs to actually exhibit some simple solutions to general relativity containing time loops, and show how to obtain the Schrodinger equation from the application of probability theory to such a class of simple solutions, before his argument can be taken seriously.
Energy conditions (well, topological censorship, really) in classical GR prevent only traversable wormholes, and only in 3+1 dimensions. Non-simply connected spacetimes are otherwise allowed in a covariant formulation of GR, though they do not arise in an initial value problem with a simply connected spacelike initial surface.
Additionally, changing one’s past is absolutely incompatible with GR, as there is a unique metric tensor associated with each spacetime point, not two or more different ones, one for each go through a closed timelike curve. The only way time travel can happen in GR is by unwrapping these time loops into some universal cover. And there is a heavy price to pay for that, but that discussion is straying too far afield, so feel free to PM me if you want to talk further.
By the way, you’re doing an excellent job of explanation, but I hope you see by now what I meant by “playing Whack-A-Mole”. Every time you make a point, rather than acknowledge it, he’ll just restate his vague objection in more elevated jargon.
Because the wavefunction is, first and foremost, an object in a Hilbert space satisfying an L^2 conservation law, so the only legitimate way to define its “size” or “degree of reality” is the L^2 norm.
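(If it helps, a trivial numerical check of the conservation law being invoked here, with a made-up random Hamiltonian of my own: unitary Schrodinger evolution leaves the L^2 norm of any state unchanged.)

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
H = (A + A.conj().T) / 2                     # Hermitian "Hamiltonian"

psi = rng.normal(size=8) + 1j * rng.normal(size=8)
psi /= np.linalg.norm(psi)

U = expm(-1j * H * 0.7)                      # Schrodinger evolution for time t = 0.7
print(np.linalg.norm(psi), np.linalg.norm(U @ psi))   # both 1.0, up to rounding
```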
“Degree of reality”—an interesting concept, especially when employed as an explanation of why some things happen more often than others. Why does this coin come up heads twice as often as it comes up tails? Because coming up heads has twice the “degree of reality” of coming up tails. Funny, they both felt equally real when they happened…
Face it: if you are going to assert that the observed frequencies of physical events are explained by the existence of Many Worlds, then the frequencies with which those events occur throughout the Many Worlds have to match the observed frequencies. You are going to have to say that the L^2 norm tells you how many copies of a branch exist, not just that a branch has a “size” or a “degree of reality”.
It would be nice if the universe were finite, but you can’t demand that a priori; it’s enough that the infinite mathematical object obeys simple rules.
I’m saying that if we lived in another universe, and someone came along and described to us the wavefunction for the Schrodinger equation, and asked how we should regard the size of some part of the configuration space compared to some other part, the L^2 norm is the blindingly obvious mathematical answer because of the properties of the wavefunction. And so if we (outside the system) were looking for a “typical” instance of a configuration corresponding to a mind, we would weight the configurations by the L^2 norm of the wavefunction.
Because (as it turns out) the wavefunction has a distinguished exceptionally-low-entropy state corresponding to the Big Bang, the configurations where the wavefunction is relatively large encode in various ways the details of (practically unique) intermediate stages between the Big Bang state and the one under consideration: that is, they encode unique histories (1). So a “typical” instance of a configuration containing a mind turns out to be one that places it within a context of a unique and lawful history satisfying the Born probabilities, because the L^2 norm of the wavefunction over the set where they hold to within epsilon is much, much larger than the L^2 norm of the rest. So to the extent that I’m a typical instance of mind-configurations similar to me, I should expect to remember and see evidence of histories satisfying the Born probabilities.
...seriously, I don’t see why people get worked up over this. OK, Eliezer has his infinite-set atheism, and you have your insistence on a naive theory of qualia, but what about everyone else?
(1) This is not a conjecture, it is not controversial, it is something you can prove mathematically about the Schrodinger equation in various contexts.
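To make that concrete, here is a toy calculation (a binomial stand-in for N repeated two-outcome measurements; the parameters are my own) showing the L^2 weight concentrating on the branches whose observed frequencies are near the Born value, even though those branches are a vanishing fraction of all 2^N branches.

```python
import numpy as np
from scipy.stats import binom

p, eps = 0.64, 0.05            # Born weight of outcome "0", and the frequency tolerance
for N in (10, 100, 1000, 10000):
    k = np.arange(N + 1)
    weights = binom.pmf(k, N, p)                 # total L^2 weight of branches with k zeros
    near_born = np.abs(k / N - p) <= eps
    print(N, weights[near_born].sum())
# -> roughly 0.24, 0.75, 0.999, 1.0: the weight piles up on the near-Born branches
```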
Does “part of the configuration space” refer to a single state vector, or a whole region that a state vector might belong to? My impression is that measuring the latter sort of thing is problematic from a rigorous mathematical standpoint. Is this correct, and does it have consequences for your discussion?
I say the former; people scared of continuous densities might prefer the latter, at which point they have the traditional sorites paradox of how large an epsilon-neighborhood to draw; but in practical terms, this isn’t so bad because (if we start with low entropy) decoherence rapidly separates the wavefunction into thin wisps with almost-zero values taken between them.
Okay, I have tried to understand what sort of ontology could answer to your description. A key consideration: you say we should judge “the size of some part of the configuration space compared to some other part” according to “the L^2 norm of the wavefunction”. You also talk about “mind-configurations similar to me”.
A wavefunction may evolve over time, but configuration space does not. Configuration space is a static arena, and the amplitudes associated with configurations change (unless we’re talking about a timeless wavefunction of the universe; I’ll come to that later). In general, I infer from your discussion that configurations are real—they are the worlds or branches—and the wavefunction determines a measure on configuration space. The measure can’t be identified with the wavefunction—the phase information is lost—so, if we are to treat the wavefunction as also real, we seem to have a dualism remotely similar to Bohmian mechanics: The wavefunction is real, and evolves over time, and there is also a population of configurations—the worlds—whose relative multiplicity changes according to the changing measure.
I want to note one of the peculiarities of this perspective. Since configuration space does not change, and since the different configurations are the worlds, then at every moment in the history of the universe, every possible configuration exists (presumably except for those isolated configurations which have an individual measure of exactly zero). What distinguishes one moment from the next is that there is “more” or “less” of each individual configuration. If we take the use of the mathematical continuum seriously, then it seems that there must be an uncountable number of copies of each configuration at each moment, and the measure is telling us the relative sizes of these uncountable sets.
This scenario might be simplified a little if you had a timeless wavefunction of the universe, if the basic configurations were combinatorial (discrete degrees of freedom rather than continuous), and if amplitudes / probabilities were rational numbers. This would allow your multiverse to consist of a countable number of configurations, each duplicated only finitely often, and without the peculiar phenomenon of all configurations having duplicates at every moment in the history of the universe. This would then land us in a version of Julian Barbour’s Platonia.
There are three features of this analysis that I would emphasize. First, relativity in any space-time sense has disappeared. The worlds are strictly spatial configurations. Second, configurations must be duplicated (whether only finitely often, or uncountably infinitely often), in order for the Born frequencies to be realized. Otherwise, it’s like the parable of the car dealer. Just associating a number with a configuration does not by itself make the events in that configuration occur more frequently. Third, the configurations are distinct from the wavefunction. The wavefunction contains information not contained in the measure, namely the phase relations. So we have a Bohm-like dualism, except, instead of histories guided by a pilot wave, we have disconnected universe-moments whose multiplicities are determined by the Born rule.
There are various ways you could adjust the details of this ontology—which, I emphasize, is an attempt to spell out the ontological commitments implied by what you said. For example, your argument hinged on typicality—being a typical mind-configuration. So maybe, instead of saying that configurations are duplicated, you could simply say that configurations only get to exist if their amplitude is above some nonzero threshold, and then you could argue that Born frequencies are realized inside the individual universe-configuration. This would be a version of Everett’s original idea, I believe. I thought it had largely been abandoned by modern Many Worlds advocates—for example, Robin Hanson dismisses it on the way to introducing his idea of mangled worlds—but I would need to refresh my knowledge of the counterarguments to personally dismiss it.
In any case, you may wish to comment on (1) my assertion that this approach requires dualism of wavefunction and worlds (because the wavefunction can’t be identified with the ensemble of worlds, on account of containing phase information), (2) my assertion that this approach requires world duplication (in order to get the frequencies right), and (3) the way that configuration has supplied a definitely preferred basis in my account. Most Many Worlds people like to avoid a preferred basis, but I don’t see how you can identify the world we actually experience with a wavefunction-part unless you explicitly say that yes, that wavefunction-part has a special status compared to other possible local basis-decompositions. Alternatively, you could assert that several or even all possible basis-decompositions define a “valid” set of worlds, but validity here has to mean existing—so along with the ensemble of spatial configurations, distinct from the wavefunction, you will end up with other ensembles of worlds, corresponding to the basis wavefunctions in other choices of basis, which will also have to be duplicated, etc., in order to produce the right frequencies.
To sum up, my position is that if you do try to deliver on the claims regarding how Many Worlds works, you have to throw out relativity as anything more than a phenomenological fact; you have to have duplication of worlds in order to get the Born frequencies; and the resulting set of worlds can’t be identified with the wavefunction itself, so you end up with a Bohm-like dualism.
This is probably not true. To really get off the ground with quantum field theory, you have to attach an a priori different Hilbert space of states to each space-like slice of spacetime, and make sense of what equations of motion could mean in this setting—at least this is my limited understanding. I haven’t been following your discussion and I don’t know how it affects the MWI.
That is a valid formalism but then all the Hilbert spaces are copies of the same Hilbert space, and in the configuration basis, the state vectors are still wavefunctionals over an identical configuration space. The only difference is that the configurations are defined on a different hypersurface; but the field configurations are otherwise the same.
ETA: This comment currently has one downvote and no follow-up, which doesn’t tell me what I got “wrong”. But I will use the occasion to add some more detail.
In perturbative quantum field theory, one of the basic textbook approaches to calculation is the “interaction picture”, a combination of the Schrodinger picture in which the state vector evolves with time and the operators do not, and the Heisenberg picture, in which the state vector is static and the operators evolve with time. In Veltman’s excellent book Diagrammatica, I seem to recall a discussion of the interaction picture, in which it was motivated in terms of different Hilbert spaces. But I could be wrong, he may just have been talking about the S-matrix in general, and I no longer have the book.
The Hilbert spaces of quantum field theory usually only have a nominal existence anyway, because of renormalization. The divergences mean that they are ill-defined; what is well-defined is the renormalization procedure, which is really a calculus of infinite formal series. It is presumed that truly fundamental, “ultraviolet-complete” theories should have properly defined Hilbert spaces, and that the renormalizable field theories are well-defined truncations of unspecified UV-complete theories. But the result is that practical QFT sees a lot of abuse of formalism when judged by mathematical standards.
So it’s impossible to guess whether Sewing-Machine is talking about a way that Hilbert spaces are used in a particular justification of a practical QFT formalism; or perhaps it is the way things are done in one of the attempts to define a mathematically rigorous approach to QFT, such as “algebraic QFT”. These rigorous approaches usually have little to say about the field theories actually used in particle physics, because the latter are renormalizable gauge theories and only exist at that “procedural” level of calculation.
But it should in any case be obvious that the configuration space of field theory in flat space is the same on parallel hypersurfaces. It’s the same fields, the same geometry, the same superposition principle. Anyone who objects is invited to provide a counterargument.
Mathematicians work with toy models of quantum field theories (eg topological QFTs) whose purpose would be entirely defeated if all slices had the same fields on them. For instance, the topology of a slice can change, and mathematicians get pretty excited by theories that can measure such changes. Talking about flat spacetime I suppose such topology changes are irrelevant, and you’re saying moreover that there are no subtler geometric changes that are relevant at all? What if I choose two non-parallel slices?
Quantum field theory is very flexible and can take many forms. In particle physics one mostly cares about quantum fields in flat space—the effects of spatial curvature are nonexistent in, e.g., particle collider physics—and this is really the paradigmatic form of QFT as far as a physicist is concerned. There is a lot that can be done with QFT in curved space, but ultimately that takes us towards the fathomless complexities of quantum gravity. I expect that the final answer to the meaning of quantum mechanics lies there, so it is not a topic one can avoid in the long run. But I do not think that adding gravity to the mix simplifies MWI’s problem with relativity, unless you take Julian Barbour’s option and decide to prefer the position basis on the grounds that there is no time evolution in quantum gravity. That is an eccentric combination of views and I think it is a side road to nowhere, on the long journey to the truth. Meanwhile, in the short term, considering the nature of quantum field theory in Minkowski space has the value that it shows up a common deficiency in Many Worlds thinking.
Non-parallel slices… In general, we are talking about time evolution here. For example, we may consider a wavefunction on an initial hypersurface, and another wavefunction on a final hypersurface, and ask what is the amplitude to go from one to the other. In a Schrodinger picture, you might obtain this amplitude by evolving the initial wavefunction forward in time to the final hypersurface, and then taking the inner product with the final wavefunction, which tells you “how much” (as orthonormal might put it) of the time-evolved wavefunction consists of the desired final wavefunction; what the overlap is.
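Before getting to the non-parallel case, a bare-bones sketch of that overlap calculation, with a made-up two-level Hamiltonian standing in for the field-theoretic evolution (a sketch of the structure, nothing more):

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[0.0, 1.0], [1.0, 0.0]])            # toy two-level Hamiltonian
psi_initial = np.array([1.0, 0.0])                # state on the initial hypersurface
psi_final = np.array([0.0, 1.0])                  # state on the final hypersurface

t = 1.3
amplitude = psi_final.conj() @ expm(-1j * H * t) @ psi_initial
print(amplitude, abs(amplitude) ** 2)             # the overlap, and the associated probability
```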
Non-parallel spacelike hypersurfaces will intersect somewhere, but you could still try to perform a similar calculation. The first difficulty is, how do you extrapolate from the ‘initial’ to the ‘final’ hypersurface? Ordinary time evolution won’t do, because the causal order (which hypersurface comes first) will be different on different sides of the plane of intersection. If I was trying to do this I would resort to path integrals: develop a Green’s function or other propagator-like expression which provides an amplitude for a transition from one exact field configuration on one hypersurface, to a different exact field configuration on the other hypersurface, then express the initial and final wavefunctions in the configuration basis, and integrate the configuration-to-configuration transition amplitudes accordingly. One thing you might notice is that the amplitude for configuration-to-configuration transition, when we talk about configurations on intersecting hypersurfaces, ought to be zero unless the configurations exactly match on the plane of intersection.
It’s sort of an interesting problem mathematically, but it doesn’t seem too relevant to physics. What might be relevant is if you were dealing with finite (bounded) hypersurfaces—so there was no intersection, as would be inevitable in flat space if they were continued to infinity. Instead, you’re just dealing with finite patches of space-time, which have a different spacelike ‘tilt’. Again, the path integral formalism has to be the right way to do it from first principles. It’s really more general than anything involving wavefunctions.
I disagree, certainly this is where all of the fun stuff happens in classical relativity.
Anyway, I guess I buy your explanation that time evolution identifies the state spaces of two parallel hypersurfaces, but my quarter-educated hunch is that you’ll find it’s not so in general.
I suspect the more fundamental difference of perspective here is about metaphysics. I feel like I can always fall back on “it doesn’t really matter, I’m only talking in terms of physics because talking in terms of simulations causes people to go funny in the head”, but my impression is that you’re skeptical of such naive computationalism? (I don’t think the hard problem has been at all solved, and I have a real appreciation for the difference between syntax and semantics—I’m something of a property dualist?—but I still don’t understand what may or may not be your opposing intuitions. (I’m sort of suspicious of SL4 folk I guess, I lump folk like Eliezer and a few others into the “never acclimated to SL5 and got left behind” crowd but only very tentatively.))
(But philosophical natter won’t help with making actual progress, won’t get you anywhere. Having concluded that the physical world and the domain of decision theory are fundamentally mathematical, the next step is to master what people know about mathematical thinking, and perhaps physics. Fluency in commonly useful mental tools, stopping just short of becoming specialized in anything in particular, so as to complete this stage in a reasonable time, like 10 years.)
(From Chaitin’s home page:
)
What did “philosophie” mean in Leibniz’s time? (For Newton, e.g., “natural philosophy” was the usual term for what we now call “physics”.)
(Perhaps we should taboo “philosophical”. Speculative technical discussion often leads to actual progress. I don’t yet believe in math, but I know that I need to hang out a lot with people who do believe in math if I’m to stay on track, and that’s what I do. (Though not enough.))
Let’s just ignore questions of consciousness entirely and think in terms of decision-making systems, which may or may not be conscious, and which have sensory inputs, some self-knowledge or introspective capacity, a capacity to make causal world-models, etc (all those things can be given a purely functionalist definition).
What then does “talking in terms of simulations” mean? It means that the decision-making system needs to consider, in choosing a world-model, worlds where it (the decision-making system) exists at the physics level—at the lowest possible level of implementation, in a given ontology—and worlds where it exists at a level somewhere above lowest possible—that is, in a simulated physics several layers of abstraction removed from a fundamental physics.
I strongly doubt that you’re going to be able to derive the Born rule by just thinking about a decision theory that worries about whether you’re an nth-level simulation, and doesn’t concern itself too much with the nature of physics at the bottom level. Back on Earth, we didn’t derive the Born rule from any sort of a priori argument; it was chosen solely on the basis of empirical adequacy. But if you are going to derive it by reasoning about your possible place in an a priori multiverse (think Tegmark level 4), then you simply have to concern yourself with the distribution of possible bottom-level physical ontologies. Even if it turns out that simulations, and simulations of simulations, are frequent enough in the multiverse that you must give those possibilities significant consideration, I don’t see how you can get to that stage without going through the stage of thinking about bottom-level physical ontologies.
Agreed, you need something like a basement to get a baseline, at the very least a logical basement as a Schelling point. There’s not a non-circular obvious decision theoretic reason why you or why cosmological natural selection would ‘pick’ the squared modulus as a Schelling point. But it’s sort of like property rights; we emerged out of Hobbesian anarchy somehow, and that somehow can be at least partially “explained” with game theory, social psychology, or ecology. Ultimately those all feed into each other, but I wouldn’t consider it fruitless to choose one approach and see how far it takes you. Does this analogy fail in the case of deriving the Born rule?
If you had said
that would make more sense to me, though I still don’t believe that timeless equilibria have much to do with anything. The relationship between simulatee and simulator is completely asymmetric, the simulatee is at the mercy of the simulator in the Vast majority of cases.
As for the origin of the Born rule itself, I certainly don’t believe it has an origin in terms of multiverse-appropriate decision theory. Quantum mechanics is incomplete, it’s a type of statistical mechanics that arises from some class of more fundamental theory that we haven’t yet identified, and the Born rule—that is, the feature that probabilities come from the product of a complex number with its complex conjugate—specifically results from features of that more fundamental theory; that’s how I think it works.
But doesn’t statistical mechanics also fall out of decision theory? Or are you saying that perspective is not a useful one in that it doesn’t explain the arrow of time? (I’m really tired right now, I apologize if I’m only half-responding to the things you’re actually saying.)
I don’t see how.
Are you using decision theory to refer even to the process whereby you decide what to believe, and not just the process whereby you decide what to do?
Yup. Bayesian agents aren’t good at thinking about themselves, and if you can’t think about yourself you’re in trouble when someone starts offering you bets. I feel like there must be a way in which the whole thing is ironic in a philosophically deep way but I can’t quite put my finger on it.
Basically there is ontology that reifies decision theory as fundamental and reasons about everything in terms of it. It’s a powerful ontology, and often it is a beautiful ontology. Even better it’s still inchoate and so it’s not yet as beautiful as it someday will be.