I liked this paper and summary, and was able to follow most of it except for the actual physics :)
I feel like I missed something important though:
If we are trying to judge P(B|A,I), what's the use of knowing the entropy of state A? The thrust I got was "give weight to possible B's in accordance with their entropy, and somehow constrain that with info from A and I", but I didn't get a sense of what using A and I as constraints looks like (I expect it would make more sense if I could do the physics examples).
I think this is captured in Section 5, the Maximum Caliber Principle:
We are given macroscopic information A which might consist of values of several physical quantities . . . such as distribution of stress, magnetization, concentration of various chemical components, etc. in various space time regions. This defines a caliber . . . which measures the number of time dependent microstates consistent with the information A.
So the idea is that you take the macro information A, use that to identify the space of possible microstates. For maximum rigor you do this independently for A and B, and if they do not share any microstates then B is impossible. When we make a prediction about B, we choose the value of B that has the biggest overlap with the possible microstates of A.
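To make the overlap-counting concrete, here's a toy sketch (my own illustration, not from the paper): a made-up 4-spin system where macro information is just a constraint on microstates, and candidate B's are scored by how many of A's compatible microstates they share. In symbols, something like P(B|A,I) ∝ |W(A) ∩ W(B)| / |W(A)|, where W(·) is the set of compatible microstates; that's my paraphrase of the scheme above, not a formula from the paper.

```python
from itertools import product

# All microstates of a hypothetical 4-spin system: tuples of 0/1.
microstates = list(product([0, 1], repeat=4))

def consistent(constraint):
    """Set of microstates compatible with a macro-level constraint."""
    return {m for m in microstates if constraint(m)}

# Macro information A: "at least two spins are up" (chosen arbitrarily).
A = consistent(lambda m: sum(m) >= 2)

# Candidate macrostates B, also phrased as constraints on microstates.
candidates = {
    "all four up":    consistent(lambda m: sum(m) == 4),
    "exactly two up": consistent(lambda m: sum(m) == 2),
    "all four down":  consistent(lambda m: sum(m) == 0),
}

for name, B in candidates.items():
    overlap = A & B
    # No shared microstates means B is impossible given A.
    verdict = "impossible" if not overlap else f"{len(overlap)}/{len(A)} of A's microstates"
    print(f"{name}: {verdict}")

# The prediction rule described above: favor the B with the biggest overlap.
print("predicted:", max(candidates, key=lambda k: len(A & candidates[k])))
```

Here "all four down" shares no microstates with A, so it comes out impossible, while "exactly two up" wins simply because it covers the largest chunk of A's microstate set.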
He talks a little bit more about the motivation for doing this in the Conclusion, here:
We should correct a possible misconception that the reader may have gained. Most recent discussions of macrophenomena outside of physical chemistry concentrate entirely on the dynamics (microscopic equations of motion or an assumed dynamical model at a higher level, deterministic or stochastic) and ignore the entropy factors of macrostates altogether. Indeed, we expect that such efforts will succeed fairly well if the macrostates of interest do not differ greatly in entropy.
Emphasis mine. So the idea here is that if you don’t need to account for the entropy of A, you will be able to tackle the problem using normal methods. If the normal methods fail, it’s a sign that we need to account for the entropy of A, and therefore to use this method.
I can’t do the physics examples either except in very simple cases. I am comforted by this line:
Although the mathematical details needed to carry it out can become almost infinitely complicated...
Thanks! In my head, I was using the model of "flip 100 coins": the exact values of all 100 coins are the microstates, and the heads-tails count is the macrostate. In that model, each macrostate corresponds to a disjoint set of microstates, so there's no overlap between macrostates at all; it's probably not a good example.
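For what it's worth, the disjointness is easy to check by brute force. A quick sketch (10 coins instead of 100 so enumeration is feasible) showing that the heads-count macrostates partition the microstates, and how wildly their sizes, i.e. their entropies, differ:

```python
from itertools import product
from math import log2

n = 10  # 10 coins instead of 100 so brute-force enumeration is feasible
microstates = list(product("HT", repeat=n))

# Group microstates by their macrostate (number of heads).
by_heads = {}
for m in microstates:
    by_heads.setdefault(m.count("H"), set()).add(m)

# Each microstate was assigned to exactly one group, so the group sizes
# summing to 2**n confirms the macrostates partition the space: disjoint
# sets, no overlap for the prediction rule to compare.
assert sum(len(s) for s in by_heads.values()) == 2 ** n

for k in sorted(by_heads):
    w = len(by_heads[k])
    print(f"{k:2d} heads: {w:4d} microstates, log2 W = {log2(w):5.2f} bits")
```

The sizes range from 1 microstate (all heads) to 252 (five heads), so the macrostates differ hugely in entropy, but since they never overlap, the model can't illustrate the overlap-based reasoning above.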
I think I get your point in the abstract, but I'm struggling to form an example model that fits it. Any suggestions?
Apologies for this being late; I also struggled to come up with an example model. Checking the references, he talks about A more thoroughly in the paper where the idea was originally presented.
I strongly recommend taking a look at page 5 of the PDF, which is where he starts a two-page section clarifying the meaning of entropy in this context. I think this will help a lot... once I figure it out.