A simple example is that in a closed container filled with gas it’s possible for all the gas molecules to spontaneously move to one side of the container. This temporarily increases the order but has nothing to do with entropy.
I think you’re ignoring the difference between the Boltzmann and Gibbs entropy, both here and in your original comment. This is going to be long, so I apologize in advance.
Gibbs entropy is a property of ensembles, so it doesn’t change when there is a spontaneous fluctuation towards order of the type you describe. As long as the gross constraints on the system remain the same, the ensemble remains the same, so the Gibbs entropy doesn’t change. And it is the Gibbs entropy that is most straightforwardly associated with the Shannon entropy. If you interpret the ensemble as a probability distribution over phase space, then the Gibbs entropy of the ensemble is just the Shannon entropy of the distribution (ignoring some irrelevant and anachronistic constant factors). Everything you’ve said in your comments is perfectly correct, if we’re talking about Gibbs entropy.
Boltzmann entropy, on the other hand, is a property of regions of phase space, not of ensembles or distributions. The famous Boltzmann formula equates entropy with the logarithm of the volume of a region in phase space. Now, it’s true that corresponding to every phase space region there is an ensemble/distribution whose Shannon entropy is identical to the Boltzmann entropy, namely the distribution that is uniform in that region and zero elsewhere. But the converse isn’t true. If you’re given a generic ensemble or distribution over phase space and also some partition of phase space into regions, it need not be the case that the Shannon entropy of the distribution is identical to the Boltzmann entropy of any of the regions.
So I don’t think it’s accurate to say that Boltzmann and Shannon entropy are the same concept. Gibbs and Shannon entropy are the same, yes, but Boltzmann entropy is a less general concept. Even if you interpret Boltzmann entropy as a property of distributions, it is only identical to the Shannon entropy for a subset of possible distributions, those that are uniform in some region and zero elsewhere.
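To make the contrast concrete, here is a minimal numerical sketch (assuming a toy "phase space" of 100 discrete cells, which is purely illustrative): a distribution that is uniform on a region and zero elsewhere has Shannon entropy equal to the log of the region's size, i.e. its Boltzmann entropy, while a generic distribution over the same cells has a Shannon entropy that need not equal the log-volume of any region.

```python
import numpy as np

def shannon(p):
    """Shannon entropy -sum p ln p, ignoring zero-probability cells."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

n_cells = 100                      # toy discretized "phase space" (illustrative)
region = np.arange(30)             # a macrostate: a region of 30 cells

# Distribution uniform on the region, zero elsewhere:
p_uniform = np.zeros(n_cells)
p_uniform[region] = 1.0 / len(region)
print(shannon(p_uniform), np.log(len(region)))   # equal: Shannon = log(volume) = Boltzmann

# A generic distribution over the same cells:
rng = np.random.default_rng(0)
p_generic = rng.random(n_cells)
p_generic /= p_generic.sum()
print(shannon(p_generic))          # generally not the log-volume of any region of whole cells
```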
As for the question of whether Boltzmann entropy can decrease spontaneously in a closed system—it really depends on how you partition phase space into Boltzmann macro-states (which are just regions of phase space, as opposed to Gibbs macro-states, which are ensembles). If you define the regions in terms of the gross experimental constraints on the system (e.g. the volume of the container, the external pressure, the external energy function, etc.), then it will indeed be true that the Boltzmann entropy can’t change without some change in the experimental constraints. Trivially true, in fact. As long as the constraints remain constant, the system remains within the same Boltzmann macro-state, and so the Boltzmann entropy must remain the same.
However, this wasn’t how Boltzmann himself envisioned the partitioning of phase space. In his original “counting argument” he partitioned phase space into regions based on the collective properties of the particles themselves, not the external constraints. So from his point of view, the particles all being scrunched up in one corner of the container is not the same macro-state as the particles being uniformly spread throughout the container. It is a macro-state (region) of smaller volume, and therefore of lower Boltzmann entropy. So if you partition phase space in this manner, the entropy of a closed system can decrease spontaneously. It’s just enormously unlikely. It’s worth noting that subsequent work in the Boltzmannian tradition, ranging from the Ehrenfests to Penrose, has more or less adopted Boltzmann’s method of delineating macrostates in terms of the collective properties of the particles, rather than the external constraints on the system.
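To get a feel for how unlikely such a spontaneous decrease is, here is a rough sketch under the usual toy assumption that each molecule sits in either half of the container independently and with equal probability; the N = 100 below is purely illustrative (a real gas has on the order of 10^23 molecules, which makes the fluctuation astronomically more improbable).

```python
import math

N = 100                                  # a tiny toy "gas" of 100 molecules (illustrative)
k_B = 1.380649e-23                       # Boltzmann constant, J/K

# Boltzmann-style macrostates defined by how many molecules are in the left half:
# "all in one half" is a much smaller region of phase space than "split roughly evenly".
W_all_left = 1                           # one way to put every molecule on the left
W_even = math.comb(N, N // 2)            # ways to split them 50/50

delta_S = k_B * (math.log(W_even) - math.log(W_all_left))
print(f"entropy gap ~ {delta_S:.2e} J/K")
print(f"probability of the all-left fluctuation ~ 2^-{N} = {2.0**-N:.1e}")
```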
Boltzmann’s manner of talking about entropy and macro-states seems necessary if you want to talk about the entropy of the universe as a whole increasing, which is something Carroll definitely wants to talk about. The increase in the entropy of the universe is a consequence of spontaneous changes in the configuration of its constituent particles, not a consequence of changing external constraints (unless you count the expansion of the universe, but that is not enough to fully account for the change in entropy on Carroll’s view).
This is going to be a somewhat technical reply, but here goes anyway.
Boltzmann entropy, on the other hand, is a property of regions of phase space, not of ensembles or distributions. The famous Boltzmann formula equates entropy with the logarithm of the volume of a region in phase space. Now, it’s true that corresponding to every phase space region there is an ensemble/distribution whose Shannon entropy is identical to the Boltzmann entropy, namely the distribution that is uniform in that region and zero elsewhere.
You cannot calculate the Shannon entropy of a continuous distribution, so this doesn’t quite make sense as stated. However, I see what you’re getting at here—if we assume that all parts of the phase space have equal probability of being visited, then the ‘size’ of the phase space can be taken as proportional to the ‘number’ of microstates (this is studied under ergodic theory). But to make this argument work for actual physical systems, where we want to calculate real quantities from theoretical considerations, the phase space must be ‘discretized’ in some way. A very simple way of doing this is the Sackur-Tetrode formulation, which discretizes a continuous phase space using the Heisenberg uncertainty principle (‘discretize’ is the best word I can come up with here—what I mean is not listing the microstates but instead measuring the volume of the phase space in units of some definite elementary volume). But there’s a catch: to be able to use the HUP, you have to formulate the phase space in terms of complementary parameters, for instance position and momentum, or energy and time.
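For concreteness, here is a sketch of the Sackur-Tetrode expression evaluated numerically; the inputs (roughly one mole of argon at 298 K and 1 atm) are just illustrative choices.

```python
import math

k_B = 1.380649e-23      # J/K
h   = 6.62607015e-34    # J*s
N_A = 6.02214076e23     # 1/mol

def sackur_tetrode(N, V, T, m):
    """Entropy of a monatomic ideal gas, with phase space counted in cells set by h."""
    lam = h / math.sqrt(2 * math.pi * m * k_B * T)   # thermal de Broglie wavelength
    return N * k_B * (math.log(V / (N * lam**3)) + 5.0 / 2.0)

# Illustrative numbers: one mole of argon (m ~ 6.63e-26 kg) at 298 K and 1 atm.
N, T = N_A, 298.0
V = N * k_B * T / 101325.0              # ideal-gas volume at 1 atm
S = sackur_tetrode(N, V, T, 6.63e-26)
print(f"S ~ {S:.1f} J/K  (~{S / (N * k_B):.1f} k_B per atom)")
```

For those inputs the result lands near the tabulated standard molar entropy of argon (roughly 155 J/(mol·K)), which is the usual sanity check on the formula.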
However, this wasn’t how Boltzmann himself envisioned the partitioning of phase space. In his original “counting argument” he partitioned phase space into regions based on the collective properties of the particles themselves, not the external constraints.
My previous point illustrates why this naive view is not physical—you can’t sensibly discretize just any kind of system. With some systems—like a box full of particles that can have arbitrary position and momentum—you get infinite (non-physical) values for entropy. It’s easy to see why you can then get a fluctuation in entropy: infinity ‘minus’ some number is still infinity!
I tried re-wording this argument several times but I’m still not satisfied with my attempt at explaining it. Nevertheless, this is how it is. Looking at entropy based on models of collective properties of particles may be interesting theoretically but it may not always be a physically realistic way of calculating the entropy of the system. If you go through something like the Sackur-Tetrode way, though, you see that Boltzmann entropy is the same thing as Shannon entropy.
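One way to see the divergence being gestured at: the discretized count depends on the size of the elementary cell, and it blows up as that cell is shrunk to zero. A toy sketch (a single particle in a 1D box, with made-up numbers):

```python
import math

# A single particle confined to a 1D box of length L, with momenta in [-p_max, p_max],
# counted in elementary cells of position*momentum "area" delta.
L, p_max = 1.0, 1.0

for delta in (1e-3, 1e-6, 1e-9, 1e-12):
    W = (L * 2 * p_max) / delta          # number of elementary cells in the accessible region
    print(f"cell size {delta:.0e}:  S/k_B = ln W = {math.log(W):.1f}")

# As delta -> 0 the count diverges, so without a definite elementary volume
# (e.g. h per position-momentum pair) the "number of microstates" is infinite.
```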
Boltzmann’s original combinatorial argument already presumed a discretization of phase space, derived from a discretization of single-molecule phase space, so we don’t need to incorporate quantum considerations to “fix” it. The combinatorics relies on dividing single-particle state space into tiny discrete boxes, then looking at the number of different ways in which particles could be distributed among those boxes, and observing that there are more ways for the particles to be spread out evenly among the boxes than for them to be clustered. Without discretization the entire argument collapses, since no more than one particle would be able to occupy any particular “box”, so clustering would be impossible.
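A minimal sketch of that counting, with made-up occupation numbers: distribute N distinguishable particles among a few discrete boxes and compare the multiplicity W = N!/(n1! n2! ... nk!) of an even spread against a fully clustered one.

```python
from math import factorial, prod, log

def multiplicity(occupation):
    """W = N! / (n1! n2! ... nk!): ways to assign distinguishable particles to boxes."""
    N = sum(occupation)
    return factorial(N) // prod(factorial(n) for n in occupation)

even      = [5, 5, 5, 5]      # 20 particles spread evenly over 4 boxes (illustrative)
clustered = [20, 0, 0, 0]     # all 20 particles crammed into one box

for occ in (even, clustered):
    W = multiplicity(occ)
    print(occ, "W =", W, " ln W =", round(log(W), 2))
# The even occupation has vastly more arrangements, hence higher Boltzmann entropy ln W.
```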
So Boltzmann did successfully discretize a box full of particles with arbitrary position and momentum, and using his discretization he derived (discrete approximations of) the Maxwell-Boltzmann distribution and the Boltzmann formula for entropy. And he did all this without invoking (or, indeed, being aware of) quantum considerations. So the Sackur-Tetrode route is not a requirement for a discretized Boltzmann-esque argument. I guess you could argue that in the absence of quantum considerations there is no way to justify the discretization, but I don’t see why that should be the case. The discretization need not be interpreted as ontological, emerging from the Uncertainty Principle. It could be interpreted as merely epistemological, a reflection of the limits of our abilities of observation and intervention.
Incidentally, none of these derivations require the assumption of ergodicity in the system. The result that the size of a macrostate in phase space is proportional to the number of microstates emerges purely from the combinatorics, with no assumptions about the system’s dynamics (other than that they are Hamiltonian). Ergodicity, or something like it, is only required to establish that the time spent by a system in a particular macrostate is proportional to the size of the macrostate, and that is used to justify probabilistic claims about the system, such as the claim that a closed system observed at an arbitrary time is overwhelmingly likely to be in the macrostate of maximum Boltzmann entropy.
So ultimately, I do think the point Carroll was making is valid. The Boltzmann entropy—as in, the actual original quantity defined by Boltzmann and refined by the Ehrenfests, not the modified interpretation proposed by people like Jaynes—is distinct from the Gibbs entropy. The former can increase (or decrease) in a closed system; the latter cannot.
To put it slightly more technically, the Gibbs entropy, being a property of a distribution that evolves according to Hamiltonian laws, is bound to stay constant by Liouville’s theorem, unless there is a geometrical change in the accessible phase space or we apply some coarse-graining procedure. Boltzmann entropy, being a property of macrostates, not of distributions, is not bound by Liouville’s theorem. Even if you interpret the Boltzmann entropy as a property of a distribution, it is not a distribution that evolves in a Hamiltonian manner. It evolves discontinuously when the system moves from one Boltzmann macrostate to the next.
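A discrete caricature of the difference, under the loose (and admittedly idealized) analogy that Hamiltonian evolution merely shuffles equal-volume cells of phase space around: a bijection of cells leaves the fine-grained Shannon/Gibbs entropy untouched, whereas a coarse-graining step can raise it.

```python
import numpy as np

def shannon(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(1)
p = rng.random(64)
p /= p.sum()                              # a distribution over 64 equal phase-space cells (illustrative)

shuffled = p[rng.permutation(64)]         # "Liouville evolution": cells just get relabelled
print(shannon(p), shannon(shuffled))      # identical: fine-grained entropy is conserved

blocks = p.reshape(8, 8).sum(axis=1)      # coarse-grain into 8 macro-cells...
p_coarse = np.repeat(blocks / 8, 8)       #   ...and smear uniformly within each
print(shannon(p_coarse))                  # >= the fine-grained value: coarse-graining can increase it
```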
But if I know that all the gas molecules are in one half of the container, then I can slide a piston into the empty half for free, and then as the gas expands to fill the container again I can extract useful work. It seems like if I know about this increase in order, it definitely constitutes a decrease in entropy.
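A quick back-of-the-envelope check of the numbers, assuming an ideal gas of N molecules held at temperature T (the N and T below are just illustrative): the work recoverable from isothermal re-expansion from half the container to the full volume is exactly T times the entropy gap N k_B ln 2.

```python
import math

k_B = 1.380649e-23    # J/K

def work_from_half_to_full(N, T):
    """Isothermal expansion of an ideal gas from V/2 to V: W = N k_B T ln 2."""
    return N * k_B * T * math.log(2)

N, T = 6.022e23, 300.0                     # illustrative numbers: ~1 mole at 300 K
W = work_from_half_to_full(N, T)
dS = N * k_B * math.log(2)                 # the corresponding entropy difference
print(f"extractable work ~ {W:.0f} J, which equals T * dS = {T * dS:.0f} J")
```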
If you know precisely when this increase in order will occur, then your knowledge about the system is necessarily very high, and the entropy you assign to it is necessarily very low (probably close to zero) to begin with.
I feel like this may be a semantics issue. I think that order implies information. To me, saying that a system becomes more ordered implies that I know about the increased order somehow. Under that construction, disorder (i.e. the absence of detectable patterns) is a measure of ignorance, and disorder is then closely related to entropy. You may be preserving a distinction between the map and the territory (i.e. between the system and our knowledge of the system) that I’m neglecting. I’m not sure which framework is more useful/productive.
I think it’s definitely an important distinction to be aware of either way.
‘Order’ is not a well-defined concept: one person’s order is another’s chaos. Entropy, on the other hand, is a well-defined concept.
Even though entropy depends on the information you have about the system, the way it depends on that information is not subjective: any two observers with the same information about the system must arrive at exactly the same value for the entropy.
All of this might seem counter-intuitive at first but it makes sense when you realize that Entropy(system) isn’t well-defined, but Entropy(system, model) is precisely defined. The ‘model’ is what Bayesians would call the prior. It is always there, either implicitly or explicitly.
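A toy illustration of the Entropy(system, model) point: the same underlying distribution over microstates, described under two different coarse-grainings (two different ‘models’), gives two different but perfectly well-defined numbers; the 16 microstates below are purely illustrative.

```python
import numpy as np

def shannon(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(2)
micro = rng.random(16)
micro /= micro.sum()                   # probabilities over 16 microstates (illustrative)

# Model A: the observer resolves all 16 microstates.
# Model B: a coarser observer only resolves 4 macrostates (blocks of 4 microstates).
model_a = micro
model_b = micro.reshape(4, 4).sum(axis=1)

print("Entropy(system, model A) =", shannon(model_a))
print("Entropy(system, model B) =", shannon(model_b))
# Different numbers, yet each is fully determined once the model (the partition / prior) is fixed.
```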