gjm comments on Gelman Against Parsimony

gjm 26 Nov 2013 20:22 UTC
2 points

My impression is that Solomonoff induction starts by assuming the Occam’s Razor.

The fact that it buys you something interesting without making that assumption was the whole point of the paragraph you were commenting on.

That’s not a problem—all simple hypotheses can be just as improbable.

I don’t believe that is true. Perhaps I’ve been insufficiently clear by trying to be brief (the difficulty being that “very complex” is really shorthand for something involving a limiting process), so let me be less brief.

First: Suppose you have a list of mutually exclusive hypotheses H1, H2, etc., with probabilities p1, p2, etc. List them in increasing order of complexity. Then the sum of all the pj is finite, and therefore as j → infinity pj → zero. Hence, “very complex hypotheses (in this list) have to be very improbable” in the following sense: for any probability p, however small, there’s a level of complexity C such that every hypothesis from your list whose complexity is at least C has probability smaller than p.

That doesn’t quite mean that very complex hypotheses have to be improbable. Indeed, you can construct very complex high-probability hypotheses as very long disjunctions. And since p and ~p have about the same complexity for any p, it must in some sense be true that about as many very complex propositions have high probabilities as have low probabilities. (So what I said certainly wasn’t quite right.)

However, I bet something along the following lines is true. Suppose you have a probability distribution over propositions (this is for generating them, and isn’t meant to have anything directly to do with the probability that each proposition is true), and suppose we also assign all the propositions probabilities in a way consistent with the laws of probability theory. (I’m assuming here that our class of propositions is closed under the usual logical operations.) And suppose we also assign all the propositions complexities in any reasonable way. Define the essential complexity of a proposition to be the infimum of the complexities of propositions that imply it. (I’m pretty sure it’s always attained.) Then I conjecture that something like this is both true and fairly easy to prove: for any fixed probability level q, as C → oo, if you generate a proposition at random (according to the “generating” distribution) conditional on its essential complexity being at least C, then Pr(its probability >= q) tends to 0.
- Lumifer 26 Nov 2013 21:15 UTC
  1 point
  Parent
  Sorry, will put this on hold for a bit—it requires some thinking and I don’t have time for it at the moment...
  - gjm 26 Nov 2013 21:59 UTC
    1 point
    Parent
    No problem!