Even Wikipedia notes that Cox’s Theorem makes another approach possible—that seems like the place to start looking if you want a mathematical proof. So I think Larks came close to the right question (though it may or may not address your concerns).
Cox and Jaynes show that we can start by requiring probability or the logic of uncertainty to have certain features. For example, our calculations should have a type of consistency such that it shouldn’t matter to our final answer if we write P(A∩B) or P(B∩A). This, together with the other requirements, ultimately tells us that:
P(A∩B) = P(B)P(A|B) = P(A)P(B|A)
This immediately gives us a possible justification for both the Kolmogorov definition of conditional probability and Bayes’ Theorem.
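To spell out that last step, here is a short sketch of how both results follow from the product rule above. It assumes P(B) > 0 so that the division is allowed; that condition is my addition and is not stated in the line above.

```latex
% Start from the product rule, written both ways:
%   P(A \cap B) = P(B) P(A|B) = P(A) P(B|A)
\begin{align*}
  % Divide the first equality by P(B), assuming P(B) > 0.
  % This is the usual Kolmogorov-style definition of conditional probability.
  P(A \mid B) &= \frac{P(A \cap B)}{P(B)} \\
  % Substitute the other factorization, P(A \cap B) = P(A) P(B|A).
  % This is Bayes' Theorem.
  P(A \mid B) &= \frac{P(A)\, P(B \mid A)}{P(B)}
\end{align*}
```

So in the Kolmogorov picture the first line is taken as a definition, while in the Cox and Jaynes picture it comes out as a consequence of the consistency requirements.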