It occurs to me that Jaynes is missing a desideratum that I might have included. I can’t decide if it’s completely trivial, or if perhaps it’s covered implicitly in his consistency rule 3c; I expect it will become clear as the discussion becomes more formal—and of course, he did promise that the rules given would turn out to be sufficient. To wit:
The robot should not assign plausibilities arbitrarily. If the robot has plausibilities for propositions A and B such that the plausibility of A is independent of the plausibility of B, and the plausibility of A is updated, then the degree of plausibility for B should remain constant barring other updates.
One more thing. The footnote on page 12 wonders: Does it follow that AND and NOT (or NAND alone) are sufficient to write any computer program?
Isn’t this trivial? Since AND and NOT can together be composed to represent any logic function, and a logic function can be interpreted as a function from some number of bits (the truth values of the variable propositions) to one result bit, it follows that we can write programs with AND and NOT that make any bits in our computer an arbitrary function of any of the other bits. Is there some complication I’m missing?
(Edited slightly for clarity.)
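For what it’s worth, here is a minimal sketch of that argument in Python (the gate helpers and the table-driven construction are my own illustration, not anything from Jaynes): NOT, AND, and OR all bottom out in NAND, and any truth table can then be realized as a sum of products over those gates.

```python
# Sketch: NAND is functionally complete. Build NOT/AND/OR from it,
# then realize an arbitrary truth table as a sum of products.

def nand(a, b):
    return not (a and b)

def not_(a):
    return nand(a, a)              # x NAND x == NOT x

def and_(a, b):
    return not_(nand(a, b))        # NOT (a NAND b) == a AND b

def or_(a, b):
    return nand(not_(a), not_(b))  # De Morgan

def from_truth_table(table):
    """Return a boolean function matching `table` (a dict from input
    tuples to output bits), built only from the gates above."""
    def f(*bits):
        result = False
        for inputs, output in table.items():
            if not output:
                continue
            term = True                      # minterm for this row
            for bit, want in zip(bits, inputs):
                term = and_(term, bit if want else not_(bit))
            result = or_(result, term)
        return result
    return f

# Example: XOR recovered from its truth table.
xor = from_truth_table({(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0})
assert [xor(a, b) for a in (0, 1) for b in (0, 1)] == [False, True, True, False]
```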
You can use NAND to implement any algorithm that has a finite upper time bound, but not “any computer program”, since a logical formula can’t express recursion.
Does that mean that digital-electronic NANDs, which can be used to build flip-flops, registers, etc., cannot be expressed in a logical formula?
Electronic NAND gates have a nonzero time delay. This allows you to connect them in cyclic graphs to implement loops.
You can model such a circuit using a set of logical formulae with one logical NAND per gate per timestep. Ata pointed out that you need an infinitely large set of logical formulae if you want to model an arbitrarily long computation this way, though you can compress it back down to a finite description if you’re willing to extend the notation a bit, so you might not consider that a problem.
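To make the one-NAND-per-gate-per-timestep picture concrete, here is a sketch (my own illustration) of the classic cyclic example: an SR latch made of two cross-coupled NAND gates, simulated with one unit of delay per gate. The latch remembers a bit across timesteps, which is exactly what no single finite formula over the inputs can do.

```python
# Two cross-coupled NAND gates form an SR latch (active-low set/reset).
# Each timestep applies one logical NAND per gate, reading the previous
# step's outputs -- the "one NAND per gate per timestep" unrolling.

def nand(a, b):
    return not (a and b)

def step(s, r, q, q_bar):
    """Advance the latch by one gate delay; returns next (q, q_bar)."""
    return nand(s, q_bar), nand(r, q)

q, q_bar = False, True                               # start in the reset state
inputs = [(False, True)] * 2 + [(True, True)] * 4    # pulse Set, then hold
for s, r in inputs:
    q, q_bar = step(s, r, q, q_bar)
    print(f"S={s:d} R={r:d} -> Q={q:d}")
# Q latches at 1 and stays 1 after the set pulse ends: the cycle carries
# state forward in time, unlike any fixed combinational formula.
```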
I agree that you are correct. Thank you.
Not sure I see what you mean. Do you have an example?
I think I was unclear. Here’s what I mean:
Suppose our robot takes these two propositions:
A = “It’s going to rain tonight in Michigan.”
B = “England will win the World Cup.”
And suppose it thinks that the plausibility of A is 40, and the plausibility of B is 25.
As far as our robot knows, these propositions are not related. That is, in Jaynes’ notation (I’ll use a bang for “not”), (A|B) = (A|!B) = 40, and (B|A) = (B|!A) = 25. Is that correct?
Now suppose that the plausibility of A jumps to 80, because it’s looking very cloudy this afternoon. I suggest that the plausibility of B should remain unchanged. I’m not sure whether the current set of rules is sufficient to ensure that, although I suspect it is. I think it might be impossible to come up with a consistent system breaking this rule that still obeys the (3c) “consistency over equivalent problems” rule.
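A quick numerical check of both points, reading the plausibilities 40 and 25 as probabilities 0.40 and 0.25 (a sketch; the 6:1 likelihood ratio is simply reverse-engineered so that the posterior for A lands at 0.80):

```python
# Joint distribution over (A, B) under independence:
# P(A) = 0.40, P(B) = 0.25, so P(A and B) = P(A) * P(B), etc.
pA, pB = 0.40, 0.25
joint = {(a, b): (pA if a else 1 - pA) * (pB if b else 1 - pB)
         for a in (True, False) for b in (True, False)}

# Jaynes-style independence check: (A|B) == (A|!B) == 0.40.
pA_given_B  = joint[(True, True)]  / (joint[(True, True)]  + joint[(False, True)])
pA_given_nB = joint[(True, False)] / (joint[(True, False)] + joint[(False, False)])
assert abs(pA_given_B - 0.40) < 1e-9 and abs(pA_given_nB - 0.40) < 1e-9

# Cloudy-afternoon evidence E that bears only on A: multiply each world
# by a likelihood that ignores B, then renormalize.
likelihood = {True: 6.0, False: 1.0}           # 6:1 ratio gives P(A|E) = 0.8
posterior = {ab: p * likelihood[ab[0]] for ab, p in joint.items()}
norm = sum(posterior.values())
posterior = {ab: p / norm for ab, p in posterior.items()}

pA_post = posterior[(True, True)] + posterior[(True, False)]
pB_post = posterior[(True, True)] + posterior[(False, True)]
print(pA_post, pB_post)                        # ~0.8, ~0.25: B is untouched
```

So at least in this numerical setup the product and sum rules do force B to stay put, provided the evidence enters as a likelihood on A alone.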
If you know from the outset that these propositions are unrelated, you already know something quite important about the logical structure of the world that these propositions describe.
Jaynes comes back to this point over and over again, and it’s also a major theme of the early chapters in Pearl’s Causality:
Probabilistic relationships, such as marginal and conditional independencies, may be helpful in hypothesizing initial causal structures from uncontrolled observations. However, once knowledge is cast in causal structure, those probabilistic relationships tend to be forgotten; whatever judgments people express about conditional independencies in a given domain are derived from the causal structure acquired. This explains why people feel confident asserting certain conditional independencies (e.g., that the price of beans in China is independent of the traffic in Los Angeles) having no idea whatsoever about the numerical probabilities involved (e.g., whether the price of beans will exceed $10 per bushel).

-- Pearl, Causality, p. 25
The way you phrase this, “suppose the plausibility of A jumps to 80,” is not rigorous. Depending on how you choose to calculate that jump, it could lead to a change in B or not.
If we consider them independent, we could imagine 100 different worlds, and we would expect A to be true in 40 of them, B to be true in 25, and so on, which would leave us with:
10 worlds where AB is true
30 worlds where A(!B) is true
15 worlds where (!A)B is true
45 worlds where (!A)(!B) is true
In general I would expect evidence to come in the form of determining that we are not in a certain world. If we determine that the probability of A rises, because we know ourselves not to be in any world where (!A)(!B) is true, then we would have to adjust the probability of B.
Your given reason, “because it’s looking very cloudy this afternoon,” would probably indicate that we are uniformly less likely to be in any given world where A is false. In this case, the plausibility of A should jump without affecting the plausibility of B; both update rules are worked through in the sketch below.
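Here is that contrast worked through on the 100 worlds above (a sketch; the downweight factor w = 1/6 is chosen so that the plausibility of A lands at 0.8):

```python
# 100 equally weighted worlds, with the counts from the list above.
worlds = {("A", "B"): 10, ("A", "!B"): 30, ("!A", "B"): 15, ("!A", "!B"): 45}

def marginals(weights):
    total = sum(weights.values())
    p_a = (weights[("A", "B")] + weights[("A", "!B")]) / total
    p_b = (weights[("A", "B")] + weights[("!A", "B")]) / total
    return p_a, p_b

# Evidence type 1: we learn we are not in any (!A)(!B) world.
survivors = {k: n for k, n in worlds.items() if k != ("!A", "!B")}
print(marginals(survivors))   # (~0.727, ~0.455): A rises, and B gets dragged up too

# Evidence type 2 ("very cloudy"): every !A world becomes uniformly
# less likely, here by a factor w = 1/6.
w = 1 / 6
reweighted = {k: n * (w if k[0] == "!A" else 1.0) for k, n in worlds.items()}
print(marginals(reweighted))  # (0.8, 0.25): A jumps, B is unaffected
```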
So what I’m really saying is that there is no sense in which statements are independent, only a sense in which evidence is independent of statements.
However, a lot of this is speculation since it really isn’t addressed directly in the first chapter, as Christian points out.
I think it is impossible to decide this based on Chapter 1 alone, for the second criterion (qualitative correspondence with common sense) is not yet specified formally.
If you look into Chapter 2, at the derivation of the product rule, you’ll see he uses this same rubbery assumption to get the results he aims for (very similarly to you).
I think one should not take some statements of the author, like “… our search for desiderata is at an end …”, too seriously.
In some sense this informal approach is defensible; from another perspective it definitely looks quite pretentious.
I don’t understand what you mean by “(B|A) = (B|!A)”.