Sorry if this is a stupid question. Let’s assume that the universe (mathematical multiverse?) gives us observations sampled from some simplicity-based distribution, like the universal distribution in UDASSA. Can that explain the initial low entropy of our universe (fewer bits to specify), and also the fact that we’re not in a tiny ordered bubble surrounded by chaos?
ETA: I see Rolf Nelson made the same point in 2007. This just makes me more puzzled why Eliezer insists on using causality, given that the causal arrow of time comes from initial low entropy of the universe in the first place, so mathematical simplicity seems to be the more fundamental thing.
A low entropy microstate takes fewer bits to specify once you’re given the macrostate to which it belongs, since low entropy macrostates are instantiated by fewer microstates than high entropy ones. But I don’t see why that should be the relevant way to determine simplicity. The extra bits are just being smuggled into the macrostate description. If you’re trying to specify the microstate directly, without any prior information about the macrostate, then it seems to me that any microstate—low or high entropy—should take the same number of bits to specify, no?
If you can encode microstate s in n bits, that implies that you have a prior that assigns P(s)=2^-n. The set of all possible microstates is countably infinite. There is no such thing as a uniform distribution over a countably infinite set. Therefore, even the ignorance prior can’t assign equal length bitstrings to all microstates.
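(A minimal sketch of that correspondence, with made-up states and codewords: any prefix-free code induces a prior via the Kraft inequality, and no code can assign equal-length strings to countably many states.)

```python
# Toy example (hypothetical states and codewords): a prefix-free code
# induces a (sub)probability distribution P(s) = 2^(-len(code(s)));
# by the Kraft inequality these probabilities sum to at most 1.
code = {
    "s0": "0",    # 1 bit  -> P = 1/2
    "s1": "10",   # 2 bits -> P = 1/4
    "s2": "110",  # 3 bits -> P = 1/8
    "s3": "111",  # 3 bits -> P = 1/8
}
implied_prior = {s: 2.0 ** -len(w) for s, w in code.items()}
print(implied_prior)                # {'s0': 0.5, 's1': 0.25, ...}
print(sum(implied_prior.values()))  # 1.0 here; <= 1 for any prefix-free code

# A uniform prior over countably many states s0, s1, s2, ... cannot exist:
# any constant p > 0 sums to infinity, and p = 0 sums to 0, so no code can
# give all of them the same length.
```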
It seems to me that macrostates should take very few bits to specify (e.g. temperature and pressure), compared to microstates within a macrostate (positions of each molecule). So there would still be a difference overall.
Equilibrium macrostates (such as the homogeneous soup at the end of the universe) can be described using very few macroscopic predicates. Non-equilibrium low-entropy states cannot. No need to go back all the way to the beginning of the universe; just look at the present macrostate. No need to look at the entire universe even, just look at the room you’re in. The amount of information you’d need to specify to pick out the current macroscopic state of your room out of all the other macroscopic states of similar entropy it could be in is in fact quite high. A few predicates isn’t going to do it. The lower entropy you get, the more structure there is in the macrostate and the more information you’ll need to specify it.
Now maybe you think that the number of bits required to specify this structure, high though it may be, absolutely pales in comparison to the number of extra bits required to pick out an equilibrium microstate. But think about what you’re doing when you specify a macrostate. You’re basically just picking out a region of phase space. Once you’ve picked out the region, you then pick out a particular point within that region to specify the microstate. In terms of the number of bits required, it makes no difference whether you start out by picking a larger region and then picking a microstate within it, or you start out by picking a smaller region. Either way, you need to rule out all points in state space but one.
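(A toy version of this counting argument, with assumptions not in the original comment: N two-state particles, a uniform prior over the 2^N microstates, and “macrostate” meaning the count k of particles in state 1. The bits to pick out the region plus the bits to pick out a point within it always total N.)

```python
from math import comb, log2

N = 100  # hypothetical system: N two-state particles, 2^N microstates

for k in (0, 10, 50):  # low-entropy vs high-entropy macrostates
    n_micro = comb(N, k)                # microstates in this macrostate
    bits_macro = -log2(n_micro / 2**N)  # bits to single out the region
    bits_micro = log2(n_micro)          # bits for a point within it
    print(f"k={k:3d}: {bits_macro:6.2f} + {bits_micro:6.2f} "
          f"= {bits_macro + bits_micro:.1f} bits")

# Every line sums to 100.0 bits: under a uniform prior, a smaller (lower
# entropy) region is costlier to single out by exactly as much as it is
# cheaper to search within.
```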
Of course, you might get the illusion of simplicity if you give your small region a special name, like “Bill”. You could say, “All I need to do to pick out this region is say ‘Bill’. It’s so simple.” But then you’ve just hidden the complexity away by choosing a gerrymandered description language. I’ll do you one better by naming the precise microstate of the early universe ‘Bob’, and now I can pick out that exact state just by saying that three-letter word. We have simple names for macro-properties that are salient to us, and that we can easily pick out perceptually, but that doesn’t mean those properties are actually simple from a fundamental physics point of view. The properties of classical thermodynamics correspond to gross constraints that we can control. They are relative to our epistemic and causal capacities. They are not nature’s joints.
Either way, you need to rule out all points in state space but one.
Are you sure that all points in state space require the same number of bits to describe, if descriptions are computer programs? It seems to me that some states are more ordered and can be written out by a shorter program. For example, if all particles have zero velocity, that can be written out by a pretty short program. Equilibrium vs non-equilibrium doesn’t really come into it, e.g. a pair of very ordered objects about to collide at high speed could have a very short description and still lead to a big bang.
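(An illustration of that point, using compressed length as a crude upper-bound proxy for program length; the byte counts in the comments are approximate.)

```python
import random
import zlib

# Compressed length bounds description length from above, up to the
# compressor's fixed overhead -- a rough stand-in for K-complexity.
n = 10_000
ordered = bytes(n)  # e.g. "all n velocities are zero": maximally regular
random.seed(0)
generic = bytes(random.getrandbits(8) for _ in range(n))  # a typical state

print(len(zlib.compress(ordered)))  # tens of bytes: a short program suffices
print(len(zlib.compress(generic)))  # ~n bytes: the data must be spelled out
```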
I agree that the constants depend on the choice of programming language, but that’s a problem for K-complexity in general. I’d love to know the solution to that...
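(For what it’s worth, the standard partial answer is the invariance theorem: the dependence on the reference language is bounded by an additive constant.)

```latex
% Invariance theorem (standard result): for any two universal prefix
% machines $U$ and $V$ there is a constant $c_{UV}$, independent of $x$,
% such that
\[
  \lvert K_U(x) - K_V(x) \rvert \le c_{UV},
\]
% so the choice of language shifts complexities only by an additive
% constant -- negligible for long strings, though it can dominate for
% short ones.
```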