In theories of Bayesianism, the axioms of probability theory are conventionally assumed to say that all logical truths have probability one, and that the probability of a disjunction of logically inconsistent statements is the sum of their probabilities. These correspond to the second and third Kolmogorov axioms.
If one then e.g. regards the Peano axioms as certain, then all theorems of Peano arithmetic must also be certain, because those are just logical consequences. And all statements which can be disproved in Peano arithmetic must then have probability zero. So the above version of the Kolmogorov axioms assumes that we are logically omniscient. This form of Bayesianism therefore doesn’t allow us to assign anything like 0.5 probability to the googolth digit of pi being odd: We must assign 1 if it’s odd, or 0 if it’s even.
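To make the omniscience consequence explicit, here is a rough sketch of the conventional axioms and the step from certain axioms to certain theorems. The labels (K1) to (K3) and the use of $\models$ for logical truth and entailment are my own notational assumptions, not something fixed by the discussion above.

```latex
\begin{align*}
&\text{(K1)} && P(A) \ge 0 \\
&\text{(K2)} && \models A \;\Rightarrow\; P(A) = 1 \\
&\text{(K3)} && \models \neg(A \land B) \;\Rightarrow\; P(A \lor B) = P(A) + P(B)
\end{align*}

% From (K2) and (K3) applied to B \lor \neg B:  P(\neg B) = 1 - P(B).
% Together with (K1) this gives P(X) \le 1 for every X.
% Now suppose A \models B. Then \models \neg(A \land \neg B), so by (K3):
\begin{align*}
P(A \lor \neg B) &= P(A) + P(\neg B) = P(A) + 1 - P(B) \le 1 \\
\Rightarrow \quad P(A) &\le P(B)
\end{align*}
% Hence P(A) = 1 forces P(B) = 1 for every logical consequence B of A.
```

Taking $A$ to be the conjunction of the Peano axioms used in a given proof and $B$ the theorem proved gives the conclusion stated above: certainty about the axioms propagates to every theorem.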
I think the simple solution is to not talk about logical tautologies and contradictions when expressing the Kolmogorov axioms for a theory of subjective Bayesianism. Instead talk about what we actually know a priori, not about tautologies which we merely could know a priori (if we were logically omniscient). Then the second Kolmogorov axiom says that statements we actually know a priori have to be assigned probability 1, and the third says that disjunctions of statements actually known a priori to be mutually exclusive have to be assigned the sum of their probabilities.
Then we are allowed to assign probabilities less than 1 to statements where we don’t actually know that they are tautologies, e.g. 0.5 to “the googolth digit of pi is odd” even if this happens to be, unbeknownst to us, a theorem of Peano arithmetic.
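One possible way to write this weakened version down, purely as a sketch: replace logical truth by a predicate $K$, read as "is actually known a priori by the agent". The symbol $K$ and the primed labels are hypothetical notation introduced here for illustration.

```latex
\begin{align*}
&\text{(K2$'$)} && K(A) \;\Rightarrow\; P(A) = 1 \\
&\text{(K3$'$)} && K\big(\neg(A \land B)\big) \;\Rightarrow\; P(A \lor B) = P(A) + P(B)
\end{align*}

% If T is a theorem of Peano arithmetic that the agent has not proved,
% then K(T) does not hold, so (K2') places no constraint on P(T):
% a value such as P(T) = 0.5 is permitted.
```

Note that this only avoids logical omniscience if $K$ is not assumed to be closed under logical consequence. Which closure properties $K$ should have is left open here.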
I think the simple solution is to not talk about logical tautologies and contradictions when expressing the Kolmogorov axioms for a theory of subjective Bayesianism. Instead talk about what we actually know a priori, not about tautologies which we merely could know a priori (if we were logically omniscient).
Yes, absolutely. When I apply probability theory, it should represent my state of knowledge, not the state of knowledge of some logically omniscient being. To me this seems so obvious that I struggle to understand why it’s still not the standard approach.
So are there some hidden paradoxes in such an approach that I just do not see yet? Or maybe some issues with the formalization of the axioms?
There is always only one correct answer for which outcome from the sample space is actually realised in this particular iteration of the probability experiment.
This doesn’t screw up our update procedure, because a probability update represents a change in our knowledge state about which iteration of the probability experiment this one could be, not a change in what has actually happened in any particular iteration.
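A minimal sketch of that reading of updating, under assumptions of my own: the "experiment" is picking one of the first 50 decimal positions of pi uniformly at random, and the evidence is a hypothetical observation that the digit there is at least 5. The digits themselves never change; conditioning only narrows down which positions this iteration could be.

```python
from fractions import Fraction

# First 50 decimal digits of pi, hard-coded so the example needs no extra libraries.
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"


def probability(event, evidence=lambda d: True):
    """P(event | evidence) for a uniformly chosen position in PI_DECIMALS.

    Conditioning keeps only the positions compatible with the evidence;
    the digit at each position is fixed throughout.
    """
    compatible = [int(c) for c in PI_DECIMALS if evidence(int(c))]
    return Fraction(sum(1 for d in compatible if event(d)), len(compatible))


def is_odd(d):
    return d % 2 == 1


def at_least_five(d):  # hypothetical piece of evidence, chosen for illustration
    return d >= 5


prior = probability(is_odd)
posterior = probability(is_odd, at_least_five)
print("P(odd) =", prior)
print("P(odd | digit >= 5) =", posterior)
```

The two numbers differ even though nothing about pi changed between the two calls, which is the sense in which the update is about the knowledge state rather than about the outcome.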
The point is that if you consider all iterations in parallel, you can realize all possible outcomes of the sample space, and assign a probability to each outcome occurring for a Bayesian superintelligence, while in a consistent proof system, not all possible outcomes/statements can be proved, no matter how many iterations are done. If you could prove them all, you would have proved the logic/theory inconsistent. That is the problem, because for logical uncertainty, there is only 1 possible outcome no matter the amount of iterations for searching for a proof/disproof of a statement (for consistent logics; an inconsistent logic can prove everything).
This is what makes logical uncertainty non-Bayesian, and it is why Bayesian reasoning assumes logical omniscience: so that this pathological outcome doesn’t happen. But as a consequence, you have basically trivialized learning/intelligence.
The point is that if you consider all iterations in parallel, you can realize all possible outcomes of the sample space
Likewise if I consider every digit of pi in parallel, some of them are odd and some of them are even.
and assign a probability to each outcome occurring for a Bayesian superintelligence
And likewise I can assign probabilities based on how often a digit of pi that is unknown to me is even or odd. I’m not sure what a superintelligence has to do with this.
while in a consistent proof system, not all possible outcomes/statements can be proved
The same applies to a coin toss. I can’t prove both “This particular coin toss is Heads” and “This particular coin toss is Tails”, any more than I can simultaneously prove both “This particular digit of pi is odd” and “This particular digit of pi is even”.
because for logical uncertainty, there is only 1 possible outcome no matter the amount of iterations
You just need to define your probability experiment more broadly, talking not about a particular digit of pi but about a random one, the same way we do for a coin toss.
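As a rough illustration of the "random digit" experiment, the sketch below estimates how often a digit of pi is odd by counting over the first digits. The digit generator is the well-known unbounded spigot algorithm attributed to Gibbons; the function names and the choice of 500 digits are my own assumptions for the example.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time.

    Integer-only unbounded spigot algorithm (Gibbons), so no floating-point
    precision limit applies.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled; emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet; consume one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


def odd_fraction(n_digits):
    """Fraction of odd digits among the first n_digits digits of pi."""
    digits = list(islice(pi_digits(), n_digits))
    return sum(d % 2 for d in digits) / n_digits


if __name__ == "__main__":
    print("fraction of odd digits among the first 500:", odd_fraction(500))
```

An agent who has not computed a given digit could use such a frequency as its credence that the digit is odd, just as it would use the long-run frequency of Heads for a coin it has not yet looked at.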
Basically, because it screws with update procedures, since formally speaking, only 1 answer is correct, and quetzal rainbow pointed this out:
https://www.lesswrong.com/posts/H229aGt8nMFQsxJMq/what-s-the-deal-with-logical-uncertainty#yHC8EuR76FE3tnuk6
Yeah, I think it’s that one