I did not go through the 9 remaining cases, but I did think about one…
Suppose (AB|C) = F[(A|BC) , (B|AC)]. Compare A=B=C with (A = B) AND (C → ~A).
Re 2-7: Yep, chain rule gets it done. By the way, it took me a few minutes to realize that your citation “2-7” refers to an equation in the pdf manuscript of the text. The numbering is different in the hardcopy version; in particular, it uses periods (e.g. equation 2.7) instead of dashes (e.g. equation 2-7). As long as we’re all consistent about that, I don’t suppose there will be much confusion.
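To spell out what that chain-rule step looks like, here is a generic sketch in my own notation; I’m assuming the equation being differentiated is the associativity equation for F, so take the exact form with a grain of salt:

```latex
% Generic sketch (my notation): differentiate the associativity
% equation with respect to x, writing F_1 for the partial derivative
% of F in its first argument.
%   F(F(x,y), z) = F(x, F(y,z))
% The chain rule on each side gives:
\[
  F_1\bigl(F(x,y),\,z\bigr)\,F_1(x,y) \;=\; F_1\bigl(x,\,F(y,z)\bigr)
\]
```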
Could we standardize on using the whole-book-as-one-PDF version, at least for the purposes of referencing equations?
ETA: So far I’ve benefited from checking the relevant parts of Kevin Van Horn’s unofficial errata pages before (and often while) reading a particular section.
OK, thanks.
I’m able to follow a fair bit of what’s going on here; the hard portions for me are when Jaynes gets some result without saying which rule or operation justifies it. I suppose it’s obvious to someone familiar with calculus, but when you lack that background it can be very hard to infer which rules are being used, so I can’t even find out how I might plug the gaps in my knowledge. (Definitely “deadly unk-unk” territory for me.)
(Of course “follow” isn’t the same thing at all as “would be able to get similar results on a different but related problem”. I grok the notion of a functional equation, and I can verify intermediate steps using a symbolic math package, but Jaynes’ overall strategy is obscure to me. Is this a common pattern, taking the derivative of a functional equation then integrating back?)
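The kind of thing I mean, on a toy example (Cauchy’s equation, which is not anything from the book):

```latex
% Differentiate-then-integrate on Cauchy's functional equation,
% assuming f is differentiable:
%   f(x + y) = f(x) + f(y)
% Differentiate with respect to y:  f'(x + y) = f'(y)
% Set y = 0:                        f'(x) = f'(0) =: c
% Integrate back:                   f(x) = c x + k
% Setting x = y = 0 in the original equation forces f(0) = 0, so k = 0:
\[
  f(x) = c\,x
\]
```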
The next bit where I lose track is 2.22. What’s going on here? Is this a total derivative?
Yeah. A total derivative. The way I think about it is that the dv thing there (jargon: a differential 1-form) eats a tangent vector in the y-z plane. It spits out the rate of change of the function in the direction of the vector (scaled by the magnitude of the vector). It does this by looking at the rate of change in the y-direction (the dy stuff) and in the z-direction (the dz stuff) and adding those together (since after taking derivatives, things get nice and linear).
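Written out, with v as a function of y and z as in the text:

```latex
% The total differential of v(y, z). Fed a tangent vector (a, b), it
% returns a * (dv/dy) + b * (dv/dz), the rate of change of v in that
% direction.
\[
  \mathrm{d}v \;=\; \frac{\partial v}{\partial y}\,\mathrm{d}y
              \;+\; \frac{\partial v}{\partial z}\,\mathrm{d}z
\]
```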
I’m not too familiar with the functional equation business either. I’m currently trying to figure out what the heck is happening on the bottom half of page 32. Figuring out the top half took me a really long while (esp. 2.50).
I’m convinced that the inequality in eqn 2.52 shouldn’t be there. In particular, when you plug in the solution S(x) = 1 - x, it’s false. I can’t tell whether anything below it depends on it, because I don’t understand much of what follows.
I could not figure out why alpha > 0 either, and it seems wrong to me too. But this does not look like a problem.
We know that J is an increasing function because of 2-49. So in 2-53, alpha and log(x/S(x)) must have the same sign, since the rest of the right-hand side tends toward 0 as q tends toward +infinity.
Then b is positive, and I think that is all that matters.
However, if alpha = 0, b is not defined. But if alpha = 0 then log(x/S(x)) = 0 as a consequence of 2-53, so x/S(x) = 1. Only one x satisfies this, since S is strictly decreasing. And by continuity we can still get 2-56.
Lovely. Thanks.
I’m totally stuck on getting 2.50 from 2.48, would appreciate a hint.
K. S. Van Horn gives a few lines describing the derivation in his PT:TLoS errata. I don’t understand why he does step 4 there; it seems irrelevant to me. The two main facts needed are steps 2-3 and step 5: the sum of a geometric series and the Taylor series expansion around y = S(x). Hopefully that is a good hint.
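For reference, here are the two facts in generic form; the specific expressions the book applies them to differ:

```latex
% Sum of a geometric series:
\[
  \frac{1}{1-z} \;=\; \sum_{k=0}^{\infty} z^{k}, \qquad |z| < 1
\]
% Taylor expansion of a smooth function f around y = S(x):
\[
  f(y) \;=\; f\bigl(S(x)\bigr)
        + f'\bigl(S(x)\bigr)\,\bigl(y - S(x)\bigr)
        + O\bigl((y - S(x))^{2}\bigr)
\]
```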
One nitpick with his errata: 1/(1-z) = 1 + z + O(z^2) “for all z” is wrong, since the interval of convergence of the RHS is (-1,1). This does not matter for the problem, since the z here will be z = exp(-q), which lies strictly between 0 and 1 because q is positive.
It is not very important, but since you mentioned it:
The interval of convergence of the Taylor series of 1/(1-z) at z=0 is indeed (-1,1).
But “1/(1-z) = 1 + z + O(z^2) for all z” does not make sense to me.
1/(1-z) = 1 + z + O(z^2) means that there is an M such that |1/(1-z) - (1 + z)| is no greater than M*z^2 for every z close enough to 0. It is about the behavior of 1/(1-z) - (1 + z) as z tends toward 0, not about z ranging over (-1,1).
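To see that such an M exists, you can exhibit one directly:

```latex
% For |z| <= 1/2 we have |1 - z| >= 1/2, hence
\[
  \left|\frac{1}{1-z} - (1 + z)\right|
  \;=\; \frac{z^{2}}{\lvert 1-z\rvert}
  \;\le\; 2\,z^{2},
\]
% so M = 2 works on the neighborhood |z| <= 1/2 of 0.
```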
Is there anything more to getting 2.53 than just rearranging things? I’m not sure I really understand where the left-hand side comes from.
Indeed, thanks!
Not sure what you’re getting at. To rule out (AB|C) = F[(A|BC) , (B|AC)], set A = B and let A’s plausibility given C be arbitrary. Let T represent the (fixed) plausibility of a tautology. Then we have
(A|BC) = (B|AC) = T (because A = B)
(AB|C) = F(T, T) = constant
But (AB|C) is arbitrary by hypothesis, so (AB|C) = F[(A|BC) , (B|AC)] is not useful.
ETA: Credit where it’s due: page 13, point 4 of Kevin S. Van Horn’s guide to Cox’s theorem (warning: pdf).
Yeah. My solution is basically the same as yours. Setting A=B=C makes F(T,T) = T. But setting A=B AND C → ~A makes F(T,T) = F, the plausibility of a known falsehood (warning: unfortunate collision here between the function F and the value F).
Given C → ~A, ({any proposition} | AC) is undefined. That’s why I couldn’t follow your argument all the way.
Ah OK. You’re right. I guess I was taking the ‘extension of logic’ thing a little too far there. I had it in my head that ({any prop} | {any contradiction}) = T since contradictions imply anything. Thanks.
That’s legit so far as it goes—it’s just that every proposition is also false at the same time, since every proposition’s negation is also true, and the whole enterprise goes to shit. There’s no point in trying to extend logic to uncertain propositions when you can prove anything.