Physicist and dabbler in writing fantasy/science fiction.
Ben
In some papers people write density operators using an enhanced "double ket" Dirac notation, where e.g. density operators are written to look like |x>>, with two ">"s. They do this precisely because it makes the differential equations look more elegant.
I think in this notation measurements look like <<m|, but I am not sure about that. The QuTiP software (which is very common in quantum modelling) uses something like this under the hood: operators (e.g. density operators) are stored internally as 1d vectors, and superoperators (maps from operators to operators) are stored as matrices.
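A minimal sketch of what that looks like in QuTiP (the two-level state and Hamiltonian here are just illustrative choices):

```python
import qutip as qt

rho = qt.fock_dm(2, 0)                    # a density matrix, stored as a 2x2 operator
vec = qt.operator_to_vector(rho)          # the "double ket" |rho>>: a 4x1 column vector
L = qt.liouvillian(qt.sigmaz())           # a superoperator: a 4x4 matrix acting on |rho>>
drho_dt = qt.vector_to_operator(L * vec)  # evolve, then map back to a 2x2 operator
```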
So structuring the notation in other ways does happen, in ways that look quite reminiscent of your tensors (maybe the same).
Yes, in your example a recipient who doesn't know the seed models the light as unpolarised, and one who does models it as, say, H-polarised in a given run. But for everyone who doesn't see the random seed it's the same density matrix.
Let's replace that first machine with a similar one that produces a polarisation-entangled photon pair, |HH> + |VV> (ignoring normalisation). If you have one of those photons it looks unpolarised (essentially your "ignorance of the random seed" can be thought of as your ignorance of the polarisation of the other photon).
If someone else (possibly outside your light cone) measures the other photon in the HV basis then they will project your photon into |H> or |V>, each with 50% probability. This 50/50 appears in the density matrix, not the wavefunction, so it is "ignorance probability".
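For concreteness, here is a minimal numpy sketch (not tied to any real experiment) checking that tracing out the distant photon leaves the maximally mixed, i.e. unpolarised, state:

```python
import numpy as np

H, V = np.array([1, 0]), np.array([0, 1])
pair = (np.kron(H, H) + np.kron(V, V)) / np.sqrt(2)   # |HH> + |VV>, normalised
rho_pair = np.outer(pair, pair.conj())                # pure-state density matrix

# Partial trace over the second photon gives your photon's reduced state.
rho_mine = rho_pair.reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)
print(rho_mine)   # 0.5 * identity: completely unpolarised, the 50/50 in the density matrix
```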
In this case, by what I understand to be your position, the fact of the matter is either (1) that the photon is still entangled with a distant photon, or (2) that it has been projected into a specific polarisation by a measurement on that distant photon. It's not clear when the transformation from (1) to (2) takes place (if it's instantaneous, then in which reference frame?).
So, in the bigger context of this conversation,
OP: “You live in the density matrices (Neo)”
Charlie: "No, a density matrix incorporates my own ignorance, so it is not a sensible picture of the fundamental reality. I can use them mathematically, but the underlying reality is built of quantum states, and the randomness when I subject them to measurements is fundamentally part of the territory, not the map. Let's not mix the two things up."
Me: "Whether a given unit of randomness is in the map (IE ignorance) or the territory is subtle. Things that randomly combine quantum states (my first machine) have a symmetry over which underlying quantum states are being mixed that looks meaningful. Plus (this post), the randomness can move abruptly from the territory to the map due to events outside your own light cone (although the amount of randomness is conserved), so maybe worrying too much about the distinction isn't that helpful."
What is the Bayesian argument, if one exists, for why quantum dynamics breaks the “probability is in the mind” philosophy?
In my world-view the argument is based on Bell inequalities. Other answers mention them; I will try to give more of an introduction.
First, context. We can reason inside a theory, and we can reason about a theory. The two are completely different and give different intuitions. Anyone talking about "but the complex amplitudes exist" or "we are in one Everett branch" is reasoning inside the theory. The theory, as given in the textbooks, is accepted as true and interpretations are built on top of it.
However, both historically and (I think) more generally, we should also reason about theories. This means we need to look at experimental observations, and ask questions like “what is the most reasonable model?”.
Many quantum experiments give random-looking results. As you point out, randomness is usually just "in the mind". Reality was deterministic, but we couldn't see everything. The terminology is "local hidden variable". For an experiment where you draw a card from a deck the "local hidden variable" was which card was on top. In a lottery with (assumedly deterministic) pinballs the local hidden variable is some very specific detail of the initial momenta and positions of the balls. In other words the local hidden variable is the thing that you don't know, so to you it looks random. It's the seed of your pseudorandom number generator.
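To push the analogy, a tiny sketch of a hidden variable as a PRNG seed (the seed value is arbitrary):

```python
import random

random.seed(1234)        # the "hidden variable": deterministic if you know it
print(random.random())   # looks random to anyone who doesn't know the seed
random.seed(1234)
print(random.random())   # identical output: the randomness was ignorance all along
```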
Entanglement: It is possible to prepare two (or more) particles in a state such that measurements of those two particles give very weird results. What do I mean by "very weird"? Well, in a classical setting, if Alice and Bob are measuring two separate objects then there are three possible (extremal) situations. (1): Their results are completely uncorrelated; for example Alice is rolling a dice in Texas and Bob is rolling a different dice in London. (2): Correlated; for example Alice is reading an email telling her she got a job she applied for, and Bob is reading an email telling him he failed to get the same job. (4): Signalling (we skipped 3 on purpose, we will get to that). Alice and Bob have phones, and so the data they receive is related to what the other of them is doing. Linear combinations of the above (e.g. noisy radio messages, correlation that is not perfect, etc.) are also possible.
By very weird, I mean that quantum experiments give rise (in the raw experimental data, before any theory is glued on) to a fourth type of relation. (3): Non-locality. Alice and Bob's measurement outcomes (observations) are random, but the correlation between their observations changes depending on the measurements they each chose to make (inputs). Mathematically it's no more complex than the others, but it's fiddly to get your head around because it's not something seen in everyday life.
An important feature of (3) is that it cannot be used to create signalling (4). However, (3) cannot be created out of any mixture of (1) and (2). (Just like (4) cannot be created by mixing (1) and (2)). In short, if you have any one of these 4 things, you can use local actions to go “down hill” to lower numbers but you can’t go up.
Anyway, "hidden variables" are shorthand for "(1) and (2)" (randomness and correlation). The "local" means "no signalling" (IE no (4), no radios). The reason we insist on no signalling is that the measurements Alice and Bob do on their particles could be outside one another's light cones (so even a lightspeed signal would not be fast enough to explain the statistics). The "no signalling" condition might sound artificial, but if you allow faster-than-light signalling then you are (by the standards of relativity) also allowing time travel.
Bell inequality experiments have been done. They measure result (3). (3) cannot be made out of ordinary "ignorance" probabilities (cannot be made from any mixture of (1) and (2)). (3) could be made out of (4) (faster-than-light signalling), but we don't see the signalling itself, and assuming it exists entails time travel.
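To give a flavour of the numbers, here is a sketch using the textbook quantum correlation E(a,b) = cos(2(a-b)) for polariser angles a and b, with the standard settings that maximise the effect (an idealised calculation, not data from any particular experiment):

```python
import numpy as np

def E(a, b):
    # Quantum correlation between the +/-1 outcomes for polariser angles a, b
    # when measuring the entangled pair |HH> + |VV>.
    return np.cos(2 * (a - b))

a1, a2 = 0.0, np.pi / 4            # Alice's two measurement settings
b1, b2 = np.pi / 8, 3 * np.pi / 8  # Bob's two measurement settings

S = E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)
print(S)   # ~2.83; any local hidden variable model (any mix of (1) and (2)) is stuck with |S| <= 2
```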
So, if we reject signalling, we know that whatever is happening in a Bell inequality experiment cannot merely be apparent randomness due to our ignorance. We also know the individual results collected by Alice and Bob look random (though the correlations between the results do not). This backs us into the corner of accepting that the randomness is somehow an intrinsic feature of the world: even the photon didn't "know" whether it would go through the polariser until you tried it.
The wiki article on Bell inequalities isn’t very good unfortunately.
Just the greentext. Yes, I totally agree that the study probably never happened. I just engaged with the actual underlying hypothesis, and to do so I felt like some summary of the study helped. But I phrased it badly and it seems like I am claiming the study actually happened. I will edit.
I thought they were typically wavefunction to wavefunction maps, and they need some sort of sandwiching to apply to density matrices?
Yes, this is correct. My mistake, it does indeed need the sandwiching, like this: ρ → UρU†.
From your talk on tensors, I am sure it will not surprise you at all to know that the sandwich thing itself (mapping from operators to operators) is often called a superoperator.
I think the reason it is the way it is is that there isn't a clear line between operators that modify the state and those that represent measurements. For example, the Hamiltonian operator evolves the state with time. But taking the trace of the Hamiltonian operator applied to the state gives the expectation value of the energy.
The way it works normally is that you have a state ρ, and it's acted on by some operator, a, which you can write as aρa†. But this doesn't give a number, it gives a new state like the old but different. (For example, if a was the annihilation operator the new state is like the old state but with one fewer photon.) This is how (for example) an operator acts on the state of the system to change that state. (It's a density matrix to density matrix map.)
In terms of dimensions this is: (1,1) = (1,1) * (1,1)
(Two square matrices of size N multiply to give another square matrix of size N).
However, to get the expected outcome of a measurement on a particular state you take Tr(Aρ), where Tr is the trace. The trace basically gets the "plug" at the left hand side of a matrix and twists it around to plug it into the right hand side. So overall what is happening is that the operators A and ρ each have shape (1,1) and what we do is:
Tr( (1,1) * (1,1)) = Tr( (1, 1) ) = number.
The “inward facing” dimensions of each matrix get plugged into one another because the matrices multiply, and the outward facing dimensions get redirected by the trace operation to also plug into one another. (The Trace is like matrix multiplication but on paper that has been rolled up into a cylinder, so each of the two matrices inside sees the other on both sides). The net effect is exactly the same as if they had originally been organized into the shapes you suggest of (2,0) and (0,2) respectively.
So if the two “ports” are called A and B your way of doing it gives:
(AB, 0) * (0, AB) = (0, 0) IE number
The traditional way:
Tr( (A, B) * (B, A) ) = Tr( (A, A) ) = (0, 0) , IE number.
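A minimal numpy sketch of both uses (the operator and state are arbitrary illustrative choices):

```python
import numpy as np

rho = np.array([[0.5, 0.5], [0.5, 0.5]])   # density matrix for |+><+|
A = np.array([[1, 0], [0, -1]])            # an operator (Pauli Z here)

new_rho = A @ rho @ A.conj().T             # (1,1)*(1,1)*(1,1): still a (1,1)-shaped matrix
expectation = np.trace(A @ rho)            # the trace closes the loop: a plain number
print(new_rho)                             # |-><-|: a new state, like the old but different
print(expectation.real)                    # 0.0: expected Z measurement outcome on |+>
```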
I haven’t looked at tensors much but I think that in tensor-land this Trace operation takes the role of a really boring metric tensor that is just (1,1,1,1...) down the diagonal.
So (assuming I understand right) your way of doing it is cleaner and more elegant for getting the expectation value of a measurement. But the traditional system works more elegantly for applying an operator to a state to evolve it into another state.
You are completely correct on the "how does the machine work inside?" question. As you point out, that density matrix has the exact form of something that is entangled with something else.
I think it's very important to be discussing what is real, although as we always have a nonzero inferential distance between ourselves and the real, the discussion has to be a little bit caveated and pragmatic.
I think the reason is that in quantum physics we also have operators representing processes (like the Hamiltonian operator making the system evolve with time, or the position operator that “measures” position, or the creation operator that adds a photon), and the density matrix has exactly the same mathematical form as these other operators (apart from the fact the density matrix needs to be normalized).
But that doesn’t really solve the mystery fully, because they could all just be called “matrices” or “tensors” instead of “operators”. (Maybe it gets us halfway to an explanation, because all of the ones other than the density operator look like they “operate” on the system to make it change its state.)
Speculatively, it might be to do with the fact that some of these operators act on continuous variables (like position), where the matrix representation has infinite rows and infinite columns. Maybe there is some technicality where if you have an object like that you have to stop using the word "matrix" or the maths police lock you up.
There are some non-obvious issues with saying "the wavefunction really exists, but the density matrix is only a representation of our own ignorance". It's a perfectly defensible viewpoint, but I think it is interesting to look at some of its potential problems:
A process or machine prepares either |0> or |1> at random, each with 50% probability. Another machine prepares either |+> or |-> based on a coin flip, where |+> = (|0> + |1>)/root2 and |-> = (|0> - |1>)/root2. In your ontology these are actually different machines that produce different states. In contrast, in the density matrix formulation these are alternative descriptions of the same machine. In any possible experiment, the two machines are identical. Exactly how much of a problem this is for believing in wavefunctions but not density matrices is debatable: "two things can look the same, big deal" vs "but experiments are the ultimate arbiters of truth, and if experiment says they are the same thing then they must be, and the theory needs fixing".
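A minimal numpy check of that claim:

```python
import numpy as np

ket0, ket1 = np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def dm(psi):
    return psi @ psi.conj().T              # |psi> -> |psi><psi|

rho_machine1 = 0.5 * dm(ket0) + 0.5 * dm(ket1)    # 50/50 mix of |0>, |1>
rho_machine2 = 0.5 * dm(plus) + 0.5 * dm(minus)   # 50/50 mix of |+>, |->
print(np.allclose(rho_machine1, rho_machine2))    # True: same density matrix, same machine
```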
There are many different mathematical representations of quantum theory. For example, instead of states in Hilbert space we can use quasi-probability distributions in phase space, or path integrals. The relevance to this discussion is that the quasi-probability distributions in phase space are equivalent to density matrices, not wavefunctions. To exaggerate the case, imagine that we have a large number of different ways of putting quantum physics into a mathematical language, [A, B, C, D...] and so on. All of them are physically the same theory, just couched in different mathematical languages, a bit like how ["Hello", "Hola", "Bonjour", "Ciao"...] all mean the same thing in different languages. But wavefunctions only exist as an entity separable from density matrices in some of those descriptions. If you had never seen another language, maybe the fact that the word "Hello" contains the word "Hell" as a substring might seem to correspond to something fundamental about what a greeting is (after all, "Hell is other people"). But it's just a feature of English, and languages with an equal ability to greet don't have it. Within the Hilbert space language it looks like wavefunctions might have a level of existence that is higher than that of density matrices, but why are you privileging that specific language over others?
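As one concrete instance, QuTiP's wigner function builds the phase-space representation directly from a density matrix, with no reference to a wavefunction (a sketch; the one-photon state is just an example):

```python
import numpy as np
import qutip as qt

rho = qt.fock_dm(10, 1)           # density matrix of a one-photon state
xvec = np.linspace(-4, 4, 81)
W = qt.wigner(rho, xvec, xvec)    # quasi-probability distribution over phase space
print(W.min() < 0)                # True: it dips negative, hence "quasi"-probability
```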
In a wavefunction-only ontology we have two types of randomness: normal ignorance, and the weird fundamental quantum uncertainty. In the density matrix ontology we have the total probability, plus some weird quantum thing called "coherence" that means some portion of that probability can cancel out when we might otherwise expect it to add together. Taking another analogy (I love those), the split you like is [100ml water + 100ml oil] (but water is just my ignorance and doesn't really exist), and you don't like the density matrix representation of [200ml fluid total, oil content 50%]. There is no "problem" here per se, but I think it helps underline how the two descriptions seem equally valid. When someone else measures your state they either kill its coherence (drop the oil % to zero), or they transform its oil into water. Equivalent descriptions.
All of that said, your position is fully reasonable, I am just trying to point out that the way density matrices are usually introduced in teaching or textbooks does make the issue seem a lot more clear cut than I think it really is.
I just looked up the breakfast hypothetical. It's interesting, thanks for sharing it.
So, my understanding is that (supposedly) someone asked a lot of prisoners "How would you feel if you hadn't had breakfast this morning?", did IQ tests on the same prisoners, and found that the ones who answered "I did have breakfast this morning." or equivalent were on average very low in IQ. (Let's just assume for the purposes of discussion that this did happen as advertised.)
It is interesting. I think in conversation people very often hear the question they were expecting, and if it's unexpected enough they hear the words rearranged to make them more expected. There are conversations where the question could fit smoothly, but in most contexts it's a weird question that would mostly be measuring "are people hearing what they expect, or what is actually being said". This may also correlate strongly with having English as a second language.
I find the idea "dumb people just can't understand a counterfactual" completely implausible. Without a counterfactual you can't establish causality. Without causality there is no way of connecting action to outcome. How could such a person even learn to use a TV remote? Given that these people (I assume) can operate TV remotes, they must in fact understand counterfactuals internally, although it's possible they lack the language skills to clearly communicate about them.
The question of "why should the observed frequencies of events be proportional to the square amplitudes" is actually one of the places where many people perceive something fishy or weird with many worlds. [https://www.sciencedirect.com/science/article/pii/S1355219809000306]
To clarify, it's not a question of possibly rejecting the square-amplitude Born Rule while keeping many worlds. It's a question of whether the square-amplitude Born Rule makes sense within the many worlds perspective, and if it doesn't, what should be modified about the many worlds perspective to make it make sense.
I agree with this. It's something about the guilt that makes this work. Also the sense that you went into it yourself somehow reshapes the perception.
I think the loan shark business model maybe follows the same logic. [If you are going to eventually get into a situation where the victim pays or else suffers violence, then why doesn't the perpetrator just skip the costly loan step at the beginning and go in threat-first? I assume that the existence of loan sharks (rather than just blackmailers) proves something about how, if people feel like they made a bad choice or engaged willingly at some point, they are more susceptible. Or maybe it's frog boiling.]
On the "what did we start getting right in the 1980s for reducing global poverty" question, I think most of the answer was a change in direction by China. In the late 70s they started reforming their economy (more capitalism, less command economy): https://en.wikipedia.org/wiki/Chinese_economic_reform.
Comparing this graph on wiki https://en.wikipedia.org/wiki/Poverty_in_China#/media/File:Poverty_in_China.svg , to yours, it looks like China accounts for practically all of the drop in poverty since the 1980s.
Arguably this is a good example for your other points. More willing participation, less central command.
I don't think the framing "Is behaviour X exploitation?" is the right framing. It takes what (should be) an argument about morality and instead turns it into an argument about the definition of the word "exploitation" (where we take it as given that, whatever the hell we decide exploitation "actually means", it is a bad thing). For example, see this post: https://www.lesswrong.com/posts/yCWPkLi8wJvewPbEp/the-noncentral-fallacy-the-worst-argument-in-the-world. Once we have a definition of "exploitation" there might be some weird edge cases that are technically exploitation but are obviously fine.
The substantive argument (I think) is: when two parties have unequal bargaining positions, is it OK for the stronger party to get the best deal it can? A full widget is worth a million dollars. I possess the only left half of a widget in the world. Ten million people each possess a right half that could dock with my left half. Right halves not used to make widgets are worthless. What is the ethical split for me to offer for a right half in this case?
[This is maybe kind of equivalent to the dating example you give. At least in my view the "bad thing" in the dating example is the phrase "She begins using this position to change the relationship". The word "change" is the one that sets off the alarms for me. If they both went in knowing what was going on then, to me, that's OK. It's the "trap" that is not. I think most of the things we would object to are like this; those Monday meetings and that expensive suit are implied to be surprises sprung on poor Bob.]
The teapot comparison (to me) seems to be a bad one. I got carried away and wrote a wall of text. Feel free to ignore it!
First, let's think about normal probabilities in everyday life. Sometimes there are more ways for one state to come about than another state. For example, if I shuffle a deck of cards the number of orderings that look random is much larger than the number of ways (exactly 1) of the cards being exactly in order.
However, this manner of thinking only applies to certain kinds of thing: those that are in-principle distinguishable. If you have a deck of blank cards, there is only one possible order, BBBBBB... To take another example, an electronic bank account might display a total balance of $100. How many different ways are there for that $100 to be "arranged" in that bank account? The same number as 100 coins labelled "1" through "100"? No, of course not. It's just an integer stored on a computer, and there is only one way of picking out the integer 100. The surprising examples of this come from quantum physics, where photons act more like the bank account: there is only one way for a particular mode to contain 100 indistinguishable photons. We don't need to understand the standard model for this; even if we didn't have any quantum theory at all we could still observe these Boson statistics in experiments.
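A toy version of the two counting rules:

```python
from math import factorial

labelled_deck = factorial(52)   # distinguishable cards: ~8.07e67 distinct orderings
blank_deck = 1                  # blank cards (or photons in one mode, or dollars in an
                                # account): swapping them changes nothing, one arrangement
print(labelled_deck, blank_deck)
```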
So now, we encounter anthropic arguments like Doomsday. These arguments are essentially positing a distribution, where we take the exact same physical universe and its entire physical history from beginning to end (which includes every atom, every synapse firing and so on). We then look at all of the "counting minds" in that universe (people count, ants probably don't, aliens, who knows), and we create a whole slew of "subjective universes", U1, U2, U3, U4, etc., where each of them is atomically identical to the original but "I" am born as a different one of those minds (I think these are sometimes called "centred worlds"). We assume that all of these subjective universes were, in the first place, equally likely, and we start finding it a really weird coincidence that in the one we find ourselves in we are a human (instead of an ant), or that we are early in history. This is, as I understand it, The Argument. You can phrase it without explicitly mentioning the different U's, by saying "if there are trillions of people in the future, the chances of me being born in the present are very low. So, the fact I was born now should update me away from believing there will be trillions of people in the future" - but the U's are still doing all the work in the background.
The conclusion depends on treating all those different subscripted U's as distinguishable, like we would for cards that had symbols printed on them. But if all the cards in the deck are identical, there is only one sequence possible. I believe that all of the U1, U2, U3's etc. are identical in this manner. By assumption they are atomically identical at all times in history; they differ only by which one of the thinking apes gets assigned the arbitrary label "me", which isn't physically represented in any particle. If you think they look different, and we accept that, then we can indeed make these arguments; but if you think they are merely different descriptions of the same exact thing then the Doomsday argument no longer makes sense, and possibly some other anthropic arguments also fall apart. I don't think they do look different. If every "I" in the universe suddenly swapped places, but leaving all memories and personality behind in the physical synapses etc., then how would I even know it? I would be a cyborg fighting in WWXIV and would have no memories of ever being some puny human typing on a web forum in the 21st century. Instead of imagining that I was born as someone else I could imagine that I could wake up as someone else, and in any case I wouldn't know any different.
So, at least to me, it looks like the anthropic arguments are advancing the idea of this orbital teapot (the different subscripted U's, although it is, in fairness, a very conceptually plausible teapot). There are, to me, three possible responses:
1 - This set of different worlds doesn't logically exist. You could push for this response by arguing "I couldn't have been anyone but me, by definition." [Reject the premise entirely: there is no teapot.]
2 - This set of different worlds does logically make sense, and after accepting it I see that it is a suspicious coincidence I am so early in history and I should worry about that. [accept the argument—there is a ceramic teapot orbiting Mars]
3 - This set of different worlds does logically make sense, but the worlds should be treated like indistinguishable particles, blank playing cards, or bank balances. [Accept the core premise, but question its details in a way that rejects the conclusion: there is a teapot, but it's chocolate, not ceramic.]
So, my point (after all that, sorry!) is that I don't see any reason why (2) is more convincing than (3).
[For me personally, I don't like (1) because I think it does badly in cases where I get replicated in the future (e.g. sleeping beauty problems, or mind uploads, or whatever). I reject (2) because the end result of accepting it is that I can infer information through evidence that is not causally linked to it (e.g. I discover that the historical human population was much bigger than previously reported, and as a result I conclude the apocalypse is further in the future than I previously supposed). This leads me to thinking (3) seems right-ish, although I readily admit to being unsure about all this.]
I found this post to be a really interesting discussion of why organisms that sexually reproduce have been successful and how the whole thing emerges. I found the writing style, which switched rapidly between relatively serious biology and silly jokes, very engaging.
Many of the sub claims seem to be well referenced (I particularly liked the swordless ancestor to the swordfish liking mates who had had artificial swords attached).
“Stock prices represent the market’s best guess at a stock’s future price.”
But they are not the same as the market's best guess at its future price. If you have a raffle ticket that will, 100% for definite, win $100 when the raffle happens in 10 years' time, then the market's best guess of its future price is $100, but nobody is going to buy it for $100, because $100 now is better than $100 in 10 years.
Whatever it is that people think the stock will be worth in the future, they will pay less than that for it now. (Because $100 in the future isn’t as good as just having the money now). So even if it was a cosmic law of the universe that all companies become more productive over time, and everyone knew this to be true, the stocks in those companies would still go up over time, like the raffle ticket approaching the pay day.
Toy example:
1990 - Stocks in C cost $10. Everyone thinks they will be worth $20 by the year 2000, but 10 years is a reasonably long time to wait to double your money, so these two things (the expectation of $20 in the future, and the reality of $10 now) coexist without contradiction.
2000 - Stocks in C now cost $20, as expected. People now think that by 2010 they will be worth $40.
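The raffle-ticket logic as a toy discounting calculation (the 7% annual rate is an arbitrary assumption):

```python
# A guaranteed $100 payoff in 10 years, discounted at an assumed 7% per year.
future_value = 100.0
rate = 0.07
years = 10

present_value = future_value / (1 + rate) ** years
print(round(present_value, 2))   # ~50.83: the fair price today, well below $100
```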
Other Ant-worriers are out there!
“”it turned out this way, so I guess it had to be this way” doesn’t resolve my confusion”
Sorry, I mixed up the position I hold (that they maybe work like bosons) with the position I was trying to argue for, which was an argument in favor of confusion.
I can’t prove (or even strongly motivate) my “the imaginary mind-swap procedure works like a swap of indistinguishable bosons” assumption, but, as far as I know no one arguing for Anthropic arguments can prove (or strongly motivate) the inverse position—which is essential for many of these arguments to work. I agree with you that we don’t have a standard model of minds, and without such a model the Doomsday Argument, and the related problem of being cosmically early might not be problems at all.
Interestingly, I don’t think the weird boson argument actually does anything for worries about whether we are simulations, or Boltzmann brains—those fears (I think) survive intact.
I suspect there is a large variation between countries in how safely taxi drivers drive relative to others.
In London my impression is that the taxis are driven more safely than non-taxis. In Singapore it appears obvious to casual observation that taxis are much less safely driven than most of the cars.
That is very interesting! That does sound weird.