A statement, any statement, starts out with a 50% probability of being true, and then you adjust that percentage based on the evidence you come into contact with.
Zed, you have earned an upvote (and several more mental ones) from me for this display of understanding on a level of abstraction even beyond what some LW readers are comfortable with, as witnessed by other comments. How prescient indeed was Bayesian Bob’s remark:
(I shouldn’t have said that 50% part. There’s no way that’s going to go over well. I’m such an idiot.)
You can be assured that poor Rational Rian has no chance when even Less Wrong has trouble!
But yes, this is of course completely correct. 50% is the probability of total ignorance—including ignorance of how many possibilities are in the hypothesis space. Probability measures how much information you have, and 50% represents a “score” of zero. (How do you calculate the “score”, you ask? It’s the logarithm of the odds ratio. Why should that be chosen as the score? Because it makes updating additive: when you see evidence, you update your score by adding to it the number of bits of evidence you see.)
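That additive bookkeeping is easy to sketch (a minimal Python illustration; the function names are mine, not from the comment):

```python
import math

def score(p):
    """Log-odds 'score' in bits: 0 at p = 0.5, and evidence adds to it."""
    return math.log2(p / (1 - p))

def update(p, likelihood_ratio):
    """Bayesian update expressed on the odds scale."""
    odds = p / (1 - p) * likelihood_ratio
    return odds / (1 + odds)

p = 0.5                              # total ignorance: score is exactly 0
assert score(p) == 0.0

p = update(p, 4)                     # 4:1 likelihood ratio = 2 bits of evidence
assert abs(score(p) - 2.0) < 1e-9    # updating is additive: 0 + 2 bits
```

The choice of base-2 logarithm makes the units "bits", so seeing one bit of evidence moves the score by exactly 1.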
Of course, we almost never reach this level of ignorance in practice, which makes this the type of abstract academic point that people all-too-characteristically have trouble with. The step of calculating the complexity of a hypothesis seems “automatic”, so much so that it’s easy to forget that there is a step there.
If P is the probability that an ideal Bayesian would assign to a proposition A on hearing A but having observed no relevant evidence, then you have described the meta expected value of P under logical ignorance, before doing any calculations (and assuming an ignorance prior on the distribution of propositions one might hear about). It seems to me that you have leveled excessively harsh criticism at those who have made correct statements about P itself.

[Y]ou have described the meta expected value of P... It seems to me that you have leveled excessively harsh criticism at those who have made correct statements about P itself.
See my other comments. In my opinion, the correct point of view is that P is a variable (or, if you prefer, a two-argument function); the “correct” statements are about a different value of P from the relevant one (resp. depend on inappropriately fixing one of the two arguments).
EDIT: Also, I think this is the level on which Bayesian Bob was thinking, and the critical comments weren’t taking this into account and were assuming a basic error was being made (just like Rational Rian).
Of course, we almost never reach this level of ignorance in practice,
I think this is actually too weak. Hypothesis specification of any kind requires some kind of working model/theory/map of the external world; otherwise the hypothesis doesn’t have semantic content. And once you have that model, some not-totally-ignorant prior will fall out. You’re right that 50% is the probability of total ignorance, but this is something of a conceptual constant that falls out of the math—you can’t actually specify a hypothesis with such little information.
You’re right that 50% is the probability of total ignorance, but this is something of a conceptual constant that falls out of the math
Yes, that’s exactly right! It is a conceptual constant that falls out of the math. It’s purely a formality. Integrating this into your conceptual scheme is good for the versatility of your conceptual scheme, but not for much else—until, later, greater versatility proves to be important.
People have a great deal of trouble accepting formalities that do not appear to have concrete practical relevance. This is why it took so long for the numbers 0 and 1 to be accepted as numbers.
I disagree with this bit. It’s only purely a formality when you consider a single hypothesis, but when you consider a hypothesis that is composed of several parts, each of which uses the prior of total ignorance, then the 0.5 prior probability shows up in the real math (which in turn affects the decisions you make).

I describe an example of this here: http://lesswrong.com/r/discussion/lw/73g/take_heed_for_it_is_a_trap/4nl8?context=1#4nl8

If you think that the concept of the universal prior of total ignorance is purely a formality, i.e. something that can never affect the decisions you make, then I’d be very interested in your thoughts behind that.
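A toy sketch of that point (my own construction; the linked comment may differ): a hypothesis composed of n independent parts, each assigned the total-ignorance prior of 0.5, gets a combined prior of 0.5**n, which is a real number that can cross decision thresholds:

```python
# Hypothetical illustration: each independent part of a compound hypothesis
# carries the total-ignorance prior 0.5, so the conjunction's prior is 0.5**n.
def conjunction_prior(n_parts, part_prior=0.5):
    return part_prior ** n_parts

assert conjunction_prior(1) == 0.5
assert conjunction_prior(10) == 0.5 ** 10   # about 0.001, no longer a mere formality
```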
A statement, any statement, starts out with a 50% probability of being true, and then you adjust that percentage based on the evidence you come into contact with.
Is it not propositions that can only be true or false, while statements can be other things?

What’s the relevance of this question? Is there a reason “statement” shouldn’t be interpreted as “proposition” in the above?
As I see it, statements start with some probability of being true propositions, some probability of being false propositions, and some probability of being neither. So a statement about which I have no information, say a random statement that a random number generator was designed to preface with “Not” half the time, has a less than 50% chance of being true.
This speaks to the intuition that statements fail to be true most of the time. “A proposition, any proposition, starts out with a 50% probability of being true” is only true assuming the given statement is a proposition, and I think knowing that an actual statement is a proposition entails being contaminated by knowledge about the proposition’s contents.
As I see it, statements start with some probability of being true propositions, some probability of being false propositions, and some probability of being neither.
Okay. So “a statement, any statement, is as likely to be true as false (under total ignorance)” would be more accurate. The odds ratio remains the same.
The intuition that statements fail to be true most of the time is wrong, however: trivially, for every statement that is true its negation is false, and for every statement that is false its negation is true. (Statements that have no negation are neither true nor false.)
It’s just that (interesting) statements in practice tend to be positive claims about the world, and it’s much harder to make a true positive claim about the world than a true negative one. This is why a long positive claim (length measured in Kolmogorov complexity) is very unlikely to be true, while an equally long negative claim is very likely to be true. Likewise, a long conjunction of terms is unlikely to be true and a long disjunction of terms is likely to be true. Again, symmetry.
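The conjunction/disjunction symmetry can be checked directly (a sketch with n independent claims; my own illustration, not from the thread):

```python
# With n independent claims each true with probability p, a long conjunction
# is very unlikely to be true and a long disjunction very likely.
def p_conjunction(p, n):
    return p ** n

def p_disjunction(p, n):
    return 1 - (1 - p) ** n

p, n = 0.5, 20
assert p_conjunction(p, n) == 0.5 ** 20        # vanishingly unlikely
assert p_disjunction(p, n) == 1 - 0.5 ** 20    # nearly certain
# The symmetry: a disjunction is the negation of the conjunction of negations.
assert p_disjunction(p, n) == 1 - p_conjunction(1 - p, n)
```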
S=P+N
P=T+F
T=F
S=~T+T
N>0
~~~
~T+T=P+N
~T+T=T+F+N
~T=F+N
~T=T+N
~T>T

Legend:
S -> statements
P -> propositions
N -> non-propositional statements
T -> true propositions
F -> false propositions
I don’t agree with condition S = ~T + T.
Because ~T + T is what you would call the set of (true and false) propositions, and I have readily accepted the existence of statements which are neither true nor false. That’s N. So you get S = ~T + T + N = T + F + N = P + N
We can just taboo proposition and statement as proposed by komponisto. If you agree with the way he phrased it in terms of hypothesis then we’re also in agreement (by transitivity of agreement :)
(This may be redundant, but if your point is that the set of non-true statements is larger than the set of false propositions, then yes, of course, I agree with that. I still don’t think the distinction between statement and proposition is that relevant to the underlying point because the odds ratio is not affected by the inclusion or exclusion of non-propositional statements)
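The counting point both sides end up agreeing on can be sanity-checked with arbitrary toy numbers (my own illustration):

```python
# Toy count check: with as many true propositions as false ones (T == F, by
# the symmetry of negation) and at least one non-propositional statement
# (N > 0), non-true statements outnumber true ones, yet the true:false
# odds ratio stays 1:1.
T, F, N = 10, 10, 3      # true, false, neither
S = T + F + N            # all statements: S = P + N with P = T + F
not_T = F + N            # statements that are not true
assert not_T > T         # more non-true statements than true ones...
assert T / F == 1.0      # ...but the odds ratio is unaffected by N
```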