I think the crux of our disagreement [edit: one of our disagreements] is whether the macrostate we’re discussing can be chosen independently of the “uncertainty model” at all.
When physicists talk about “the entropy of a macrostate”, they always mean something of the form:
There are a bunch of p’s that add up to 1. We want the sum of p × (-log p) over all p’s. [EXPECTATION of -log p aka ENTROPY of the distribution]
They never mean something of the form:
There are a bunch of p’s that add up to 1. We want the sum of p × (-log p) over just some of the p’s. [???]
Or:
There are a bunch of p’s that add up to 1. We want the sum of p × (-log p) over just some of the p’s, divided by the sum of p over the same p’s. [CONDITIONAL EXPECTATION of -log p given some event]
Or:
There are a bunch of p’s that add up to 1. We want the sum of (-log p) over just some of the p’s, divided by the number of p’s we included. [ARITHMETIC MEAN of -log p over some event]
This also applies to information theorists talking about Shannon entropy.
I think that’s the basic crux here.
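For concreteness, here's a minimal numerical sketch of the four forms above. The distribution p and the event A are made-up illustrations, not anything physical; the point is just that only the first quantity is the thing physicists and information theorists call entropy.

```python
import numpy as np

# A toy distribution p over six outcomes and an arbitrary event A,
# chosen purely for illustration.
p = np.array([0.4, 0.2, 0.1, 0.1, 0.1, 0.1])
A = np.array([0, 1, 2])                      # "just some of the p's"

surprisal = -np.log(p)                       # -log p for each outcome

entropy    = np.sum(p * surprisal)           # expectation of -log p: the entropy
partial    = np.sum(p[A] * surprisal[A])     # sum over just some of the p's [???]
cond_exp   = partial / np.sum(p[A])          # conditional expectation of -log p given A
arith_mean = np.mean(surprisal[A])           # arithmetic mean of -log p over A

print(entropy, partial, cond_exp, arith_mean)
```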
This is perhaps confusing because “macrostate” is often claimed to have something to do with a subset of the microstates. So you might be forgiven for thinking “entropy of a macrostate” in statmech means:
For some arbitrary distribution p, consider a separately-chosen “macrostate” A (a set of outcomes). Compute the sum of p × (-log p) over every p whose corresponding outcome is in A, maybe divided by the total probability of A or something.
But in fact this is not what is meant!
Instead, “entropy of a macrostate” means the following:
For some “macrostate”, whatever the hell that means, we construct a probability distribution p. Maybe that’s the macrostate itself, maybe it’s a distribution corresponding to the macrostate, usage varies. But the macrostate determines the distribution, either way. Compute the sum of p × (-log p) over every p.
EDIT: all of this applies even more to negentropy. The “S_max” in that formula is always the entropy of the highest-entropy possible distribution, not anything to do with a single microstate.
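To spell that out with the same toy numbers as above (again an illustration, not a physical claim): S_max is the entropy of the uniform distribution over all N microstates, i.e. log N, and negentropy is S_max minus the entropy of the distribution the macrostate determines.

```python
import numpy as np

# Hedged sketch of the negentropy point: S_max is the entropy of the
# maximum-entropy (uniform) distribution over all N microstates, log N,
# not a property of any single microstate.
N = 1000                       # total number of microstates (made up for illustration)
S_max = np.log(N)              # entropy of the uniform distribution over all N states

S = np.log(100)                # entropy of the macrostate's distribution from the sketch above
negentropy = S_max - S         # = log(1000/100) = log 10
print(negentropy)
```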