Hm. Yeah, you’re right. The probability goes like 1/(1 + 2^N), where N is the number of things implied. You’d think that “3 is prime, 5 is prime, 7 is prime” would at least make the robot guess that all odd numbers were prime.
On the other hand, if we think of all possible statements “for all X, Q(X)”, there are more ways to be false than true. Infinitely more ways, even, and this is directly related to how many things are implied by the statement. So in that way, the robot is functioning exactly as intended and assigning the maximum entropy value.
Which makes this problem sort of like my suspicions about magfrump’s idea below—related to the fact that we expect simple things, we don’t actually expect maximum entropy—and we’re often right.
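To make the counting above concrete (a quick sketch of my own, not a claim about the robot’s internals): for N instances of Q there are 2^N truth-value patterns, and exactly one of them makes the universal statement true, which matches the 1/(1 + 2^N)-style collapse of the prior.

```python
from itertools import product

# All truth-value patterns for N instances Q(1)...Q(N).
N = 4
patterns = list(product([True, False], repeat=N))
all_true = [p for p in patterns if all(p)]

print(len(patterns))   # 2**N = 16 patterns
print(len(all_true))   # exactly 1 makes "for all X, Q(X)" true
print(1 / (1 + 2**N))  # the prior as described above, already small at N = 4
```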
I thought of it some more… Don’t think of probability as a way to represent ignorance. Probability represents a kind of knowledge. For example, when you say “the probability that the coin fell heads is 1⁄2”, that represents a sort of model of what bounces do to the coin’s direction. When you multiply probabilities for a pair of coin throws, that represents independence of the throws. The 1⁄2 for Q(i) is not representative of that, and even if it were representative of something (statistics on the Q), the product of these numbers is not a probability but a lower bound on the probability, assuming independence; if they aren’t independent, the probability is still 1⁄2.
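A toy simulation of that last point (my own illustration, with a made-up setup): when the Q(i) are perfectly correlated, the probability that all n of them hold stays at 1⁄2, while the independence product 2^−n is only a lower bound.

```python
import random

random.seed(0)
n, trials = 10, 100_000

# Perfectly correlated: one coin flip decides all n statements at once.
correlated = sum(all([random.random() < 0.5] * n) for _ in range(trials))

# Independent: every statement gets its own coin flip.
independent = sum(
    all(random.random() < 0.5 for _ in range(n)) for _ in range(trials)
)

print(correlated / trials)   # stays near 0.5
print(independent / trials)  # collapses toward 2**-10, about 0.001
```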
Probabilistic methods certainly are very useful in time-bounded calculations—they are used very heavily in computer graphics and simulations of all kinds—but that is usually through the use of a random number source, or a sufficiently good random number generator, as in probabilistic primality tests such as Miller-Rabin. The time-bounded bot, far from evaluating and updating “probabilities”, would have to do symbolic mathematics searching for an imperfect solution, and make use of random number sources whenever appropriate, striving for bounds on the probability of being wrong. (E.g. when I implemented Miller-Rabin I just made it run enough rounds that it wouldn’t fail until the heat death of the universe, assuming my PRNG wouldn’t somehow return a huge string of bad witnesses—which would be very interesting if it happened.)
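Here is roughly what that looks like in code—a minimal Miller-Rabin sketch of my own, not the implementation described above. Each random base that fails to expose compositeness at most quarters the error bound, so k rounds give an error probability of at most 4^−k; choosing k large enough is exactly the “won’t fail till heat death” trick.

```python
import random

def miller_rabin(n: int, k: int = 40) -> bool:
    """Probabilistic primality test; false-positive probability at most 4**-k."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True  # probably prime

print(miller_rabin(2**89 - 1))  # a Mersenne prime
print(miller_rabin(2**89 + 1))  # composite (divisible by 3)
```

With k = 40 the error bound 4^−40 is below 10^−24, which is negligible for any practical purpose.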
Well, that doesn’t seem very useful, as per Wei Dai’s example...
Why not look at the methods people actually employ to approximate and guess math when they can’t quite compute something? Applied mathematics is an enormously huge field.
On the other hand, if we think of all possible statements “for all X, Q(X)”, there are more ways to be false than true. Infinitely more ways, even,
If you consider all syntactically valid Q in some sort of math notation, no matter the length, there’s the fraction that is simply 0 = 0*(… X somewhere here …), or your favourite less trivial statement of choice. Ditto for Turing machine tapes, et cetera. There’s certainly a nonzero probability of constructing a Q(X) that holds for all X.
That is a more detailed model than the robot uses. What it means by the ways to be false or true is more like “true for 1, true for 2, true for 3, false for 4, true for 5...”. The robot can’t look inside the statements while it’s doing probabilistic logic; it can only look at truth values and relationships.
On the other hand, the power of doing that is certainly a good reason to upgrade the robot :)
I’ll be specific. Shannon entropy represents ignorance. The bigger it is, the more ignorant you are.
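For concreteness, here is that statement in code (a minimal sketch of my own; the function name is mine): the uniform distribution is the most ignorant one, and entropy drops as probability concentrates.

```python
import math

def shannon_entropy(ps):
    """Shannon entropy in bits, with the convention 0 * log 0 = 0."""
    return sum(-p * math.log2(p) for p in ps if p > 0)

print(shannon_entropy([0.5, 0.5]))  # 1.0 bit: maximal ignorance about one coin
print(shannon_entropy([0.9, 0.1]))  # about 0.47 bits: less ignorant
print(shannon_entropy([1.0, 0.0]))  # 0.0 bits: no ignorance left
```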