My guess is that Born’s Rule is related to the Solomonoff Prior. Consider a program P that takes 4 inputs:
boundary conditions for a wavefunction
a time coordinate T
a spatial region R
a random string
What P does is take the boundary conditions, use Schrödinger’s equation to compute the wavefunction at time T, then sample the wavefunction using the Born probabilities and the random input string, and finally output the particles in the region R and their relative positions.
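Here is a minimal finite-dimensional sketch of P in Python, just to make the setup concrete (the names born_sample and run_P are mine, and a fixed Hamiltonian matrix stands in for the boundary conditions plus Schrödinger evolution):

    import numpy as np
    from scipy.linalg import expm

    def born_sample(psi, rand_bits):
        # Sample an index from the Born distribution |psi|^2, using the
        # supplied random bit string in place of a random number generator.
        probs = np.abs(psi) ** 2
        probs /= probs.sum()
        # Read the first 32 bits of the random string as a uniform u in [0, 1).
        u = sum(b * 2.0 ** -(i + 1) for i, b in enumerate(rand_bits[:32]))
        return int(np.searchsorted(np.cumsum(probs), u))

    def run_P(psi0, H, T, region, rand_bits):
        # Toy P: evolve psi0 to time T under Hamiltonian H (a stand-in for
        # solving the Schrodinger equation from the boundary conditions),
        # Born-sample a configuration, and output it if it lies in region R.
        psi_T = expm(-1j * H * T) @ psi0  # units with hbar = 1
        outcome = born_sample(psi_T, rand_bits)
        return outcome if outcome in region else None

In the real P the output would be a description of the particles found in R and their relative positions, but the sampling step is the part that matters here.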
Suppose this program, along with the inputs that cause it to output the description of a given human brain, is what makes the largest contribution to the probability mass of the bitstring representing that brain in the Solomonoff Prior. This seems like a plausible conjecture (putting aside the fact that quantum mechanics isn’t actually the TOE of this universe).
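To state the conjecture a little more precisely: in one standard formulation, the Solomonoff Prior assigns a bitstring x the probability mass

    M(x) = \sum_{p : U(p) = x} 2^{-\ell(p)}

where U is a universal prefix machine and \ell(p) is the length of program p. The claim is then that, for an x describing a human brain, the dominant terms in this sum are those where p encodes P together with the four inputs above.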
(Does anyone think this is not true, or if it is true, has nothing to do with the answer to the mystery of “why squared amplitudes”?)
This idea seems fairly obvious, but I don’t recall seeing it proposed by anyone yet. One possible direction to explore is to try to prove that any modification to Born’s rule would cause a drastic decrease in the probability that P, given random inputs, would output the description of a sentient being. But I have no idea how to go about doing this. I’m also not sure how to develop this observation/conjecture into a full answer of the mystery.
I just read in Scott Aaronson’s Quantum Computing, Postselection, and Probabilistic Polynomial-Time that if the exponent in the probability rule were anything other than 2, then we’d be able to do postselection without quantum suicide and solve problems in PP. (See Page 6, Theorem 6.) The same is true if quantum mechanics were non-linear.
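To spell out the variation Aaronson considers there (as I understand Theorem 6): if measurement returns outcome i with probability

    \Pr[i] = |\alpha_i|^p / \sum_j |\alpha_j|^p

for some constant exponent p other than 2, then the resulting model is powerful enough to simulate postselection, and hence to solve PP-complete problems.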
Given that, my conjecture is implied by one that says “sentience is unlikely to evolve in a world where problems in PP (which is probably strictly harder than PH, which is probably strictly harder than NP) can be easily solved” (presumably because intelligence wouldn’t be useful in such a world).
Interesting. What would such a world look like? I imagine that instead of a selection pressure for intelligence there would be a selection pressure for raw memory, so that you could perfectly model any creature with less memory than yourself. It seems that this would be a very intense pressure, since having the upper hand essentially guarantees superiority, and you would ultimately wind up with galaxy-sized computers running through all possible simulations of other galaxy-sized computers.
I never put much stock in the simulation hypothesis, because I couldn’t see why an entity capable of simulating our universe would derive any value from doing it. This scenario makes me rethink that a little.
In any case, while this is another potential reason why the exponent must be 2 in our universe, it still doesn’t shed any light on the mechanism by which our subjective experience follows this rule.
I don’t know. I don’t have a very good understanding of regular quantum computing, much less the non-Born “fantasy” quantum computers that Aaronson used in his paper. But I’m going to guess that your speculation is probably wrong, unless you happen to be an expert in this area. These things tend not to be very intuitive at all.
I honestly can’t imagine my evolution story is right. It just seemed like an immensely fun opportunity for speculation.
“Suppose this program, along with the inputs that cause it to output the description of a given human brain, is what makes the largest contribution to the probability mass of the bitstring representing that brain in the Solomonoff Prior.”

More specifically, to replace my previous summary comment: the above statement sounds kind of redeemable, but it’s so vague and common-sensually absurd that I think it makes a negative contribution. Things like this need to be said clearly, or not at all. It invites all sorts of kookery, not just in the format of presentation, but in one’s own mind as well.
Huh, that’s a surprising response. I thought that at least the intended meaning would be obvious for someone familiar with the Solomonoff Prior. I guess “vague” I can address by making my claim mathematically precise, but why “common-sensually absurd”?
Re absurd: It’s not clear why you would say something like the quote.
I was hoping that it would trigger an insight in someone who might solve this mystery for me. As I said, I’m not sure how to develop it into a full answer myself (but it might be related to this other vague/possibly-absurd idea).
Perhaps I’m abusing this community by presenting ideas that are half-formed and “epistemically unhygienic”, but I expect that’s not a serious danger. It seems like a promising direction to explore that I don’t see anyone else exploring (kind of like UDT until recently). I have too many questions I’d like to see answered, and not enough time and ability to answer them all myself.
Hello Wei Dai. Your paradigm is a bit opaque to me. There’s a cosmology here which involves programs, program outputs, and probability distributions over each, but I can’t tell what’s supposed to exist. Just the program outputs? The program outputs and the programs? Does the program correspond to “basic physical law”, and program output to “the physical world”?
If I try to abstract away from the metaphysical idiosyncrasies, the idea seems to be that Born’s rule is true because the worlds which function according to Born’s rule are the majority of the worlds in which sentient beings show up. Well, it could be true. But here’s an interesting Bohmian fact: if you start out with an ensemble of Bohmian worlds deviating from the Born distribution, they will actually converge on it, solely due to Bohmian dynamics. (See quant-ph/0403034.) So something like the Bohmian equation of motion may actually be the more fundamental fact.
In general, I think what exists are mathematical structures, which include computations as a subclass.
“But here’s an interesting Bohmian fact: if you start out with an ensemble of Bohmian worlds deviating from the Born distribution, they will actually converge on it, solely due to Bohmian dynamics.”

Thanks for the link. That looks interesting, and I have a couple of questions that maybe you can help me with.
Why do they converge to the Born distribution? The authors make an analogy with thermal relaxation, but there is a standard explanation of the second law of thermodynamics in terms of sizes of macrostates in configuration space, and I don’t see what the equivalent explanation is for Bohmian relaxation.
What about decoherence? Suppose you have a wavefunction that has decohered into two approximately non-interacting branches occupying different parts of configuration space. If you start with a Bohmian world that belongs to one branch, then in all likelihood its future evolution will stay within that branch, right? Now if you take an ensemble of Bohmian worlds that all belong to that branch, how will it converge to the Born distribution, which occupies both branches?
This is more of an objection to the Bohmian ontology than a question. If you look at Bohmian Mechanics as a computation, it consists of two parts: (1) evolution of the wavefunction, and (2) evolution of a point in configuration space, guided by the wavefunction. But it seems like all of the real work is being done in part 1. If you wanted to simulate a quantum system, for example, it seems sufficient to just do part 1, and then sample the resulting wavefunction according to Born’s rule, and part 2 adds more complexity and computational burden without any apparent benefit.
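To illustrate the division of labor, here is a rough 1-D sketch (my own toy discretization; evolve_psi and guide are just illustrative names):

    import numpy as np

    # Part 1: evolve the wavefunction psi on a 1-D grid (one split-operator step).
    def evolve_psi(psi, V, dx, dt, hbar=1.0, m=1.0):
        k = 2 * np.pi * np.fft.fftfreq(len(psi), d=dx)
        psi = np.exp(-0.5j * V * dt / hbar) * psi
        psi = np.fft.ifft(np.exp(-0.5j * hbar * k ** 2 * dt / m) * np.fft.fft(psi))
        return np.exp(-0.5j * V * dt / hbar) * psi

    # Part 2: move the configuration point x along the guidance equation
    # dx/dt = (hbar/m) * Im(psi'(x) / psi(x)).
    def guide(x, psi, grid, dx, dt, hbar=1.0, m=1.0):
        dpsi = np.gradient(psi, dx)
        i = int(np.clip(np.searchsorted(grid, x), 1, len(grid) - 2))
        return x + (hbar / m) * np.imag(dpsi[i] / psi[i]) * dt

Part 2 consumes the output of part 1 but feeds nothing back into it, which is why it looks like idle machinery from a simulation standpoint.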
“Why do they converge to the Born distribution?”
Let’s distinguish two versions of this question. First version: why does a generic non-Born ensemble of Bohmian worlds tend to become Born-like? I think the technical answer is to be found in footnote 9 and the discussion around equation 20. But ultimately I think it will come back to a Liouville theorem in the space of distributions. There is some natural measure under which the Born-like distributions are the majority. (Or perhaps it is that non-Born regions are traversed relatively quickly.)
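For what it’s worth, if I’m reading Valentini correctly, the key quantity is the coarse-grained subquantum H-function

    \bar{H}(t) = \int \bar{\rho} \, \ln(\bar{\rho} / \overline{|\psi|^2}) \, dq

where the bars denote averaging over small coarse-graining cells in configuration space. Assuming no fine-grained microstructure in the initial state, \bar{H}(t) \le \bar{H}(0), and \bar{H} reaches its minimum of zero exactly when \bar{\rho} = \overline{|\psi|^2}, i.e. when the coarse-grained distribution is Born.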
Second version: why does an individual Bohmian world contain a Born distribution of outcomes? This follows from the answer to the first version. An individual Bohmian world consists of a universal wavefunction and a quasiclassical trajectory. If you pick just a few of the classical variables, you can construct a corresponding reduced density matrix in the usual fashion, and a reduced Bohmian equation of motion in which the evolution of those variables depends on that density matrix and on influences coming from all the degrees of freedom that were traced over. So when you look at all the instances, within a single Bohmian history, of a particular physical process, you are looking at an ensemble of noisy Bohmian microhistories. The argument above suggests that even if this starts as a non-Born ensemble, it will evolve into a Born-like ensemble. The only complication is the noise factor. But it is at least plausible that in the majority of Bohmian worlds, this nonlocal noise is just noise and does not introduce an anti-Born tendency.
From an all-worlds-exist perspective, which we both favor, I would summarize as follows: (1) the Born distribution is the natural measure on the subset of worlds consisting of the Bohmian worlds, and (2) most Bohmian worlds will exhibit an internal Born distribution of physical outcomes. At present these are conjectures rather than theorems, but I would consider them plausible conjectures in light of Valentini’s work.
“What about decoherence?”
As we’ve just discussed, Bohmian dynamics both preserves exact Born distributions and evolves non-Born distributions towards Born-like distributions (and this is true for subsystems of a Bohmian world as well as for the whole). So the sub-ensembles in the decohered branches will preserve or evolve towards Born.
“part 2 adds more complexity and computational burden without any apparent benefit”
This is a complicated matter to discuss, not least because there is an interpretation of Bohmian mechanics, the nomological interpretation, according to which the “wavefunction” is a law of motion and not a thing. In nomological Bohmian mechanics, the configuration is all that exists, evolving according to a nonlocal potential.
Is it possible (I’m not sure it makes sense to ask whether it would be easy) under our physics to build an intelligence that optimizes (or at least a structure that propagates itself) according to some metric other than the Born Rule? If not, then it should be anthropically unsurprising that we perceive probability as squared amplitude, even if there is no law of physics to that effect. On the other hand, if it is possible, then you could have a TOE from which you can’t derive how to compute probability, and there’s nothing wrong with that, because then there really is another way to interpret probability that other people in the universe (though of course not in our Everett branch) may be using.
Fair rephrasing?
The Solomonoff prior depends on the encoding of algorithms; the Born rule doesn’t. Or am I missing something?
That seems like a general argument against the whole Solomonoff Induction approach. I’d be happy to see the dependence on an encoding of algorithms removed, but until someone finds a way to do so, it doesn’t seem to be a deal-breaker. I think my claim should apply to any encoding of algorithms one might use that isn’t contrived specifically to make it false.
Epistemic hygiene alert!