I’m not sure why we need to make that distinction. The Solomonoff and Levin constructions are equivalent. The prior built from all deterministic programs that output bit strings, and the prior built from all computable probability distributions, turn out to be the same prior. See e.g. here for proofs and references.
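To make the equivalence concrete, here is a rough sketch of the two constructions in standard notation (a monotone universal machine U for the program-based prior, and an enumeration of lower-semicomputable semimeasures ν_i with prior weights for the mixture; the weight choice 2^{-K(\nu_i)} is one common convention):

M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|}, \qquad \xi(x) = \sum_i 2^{-K(\nu_i)}\, \nu_i(x)

The "same prior" statement is the standard result that each dominates the other up to a multiplicative constant: c_1 \xi(x) \le M(x) \le c_2 \xi(x) for some constants c_1, c_2 > 0 independent of x.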
They’re equivalent from the point of view of a consumer of the prediction, but not from the point of view of an implementation. And since this is a discussion about how it works, the distinction is useful.
Then I’m confused, because the two would seem to produce very different answers on the same string. Since a string with very high Kolmogorov complexity can clearly be produced by a uniform distribution, the Solomonoff prior would converge to a very-high-complexity hypothesis, while the Levin mixture would just assign 0.5 to 0 and 0.5 to 1. What am I missing here?
The Solomonoff prior would have many surviving hypotheses at each step, and the total weight of those that predict a 0 for the next bit would be about equal to the total weight of those that predict a 1. If the input distribution is biased, e.g. 0 with probability 5⁄6 and 1 with probability 1⁄6, then the Solomonoff prior will converge on that as well. That works for any computable input distribution, with probability 1 according to the input distribution.
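To see the mechanism, here is a toy sketch, not the real (uncomputable) construction: a finite Bayesian mixture over Bernoulli hypotheses, fed bits that are 0 with probability 5⁄6. The hypothesis set, uniform prior weights, and random seed are illustrative choices; the point is only that the posterior-weighted next-bit prediction converges to the input bias, as described above.

import random

# Toy stand-in for the full (uncomputable) mixture: a finite set of
# Bernoulli hypotheses "bit is 0 with probability theta", each with a
# prior weight. The real priors M and xi sum over all programs /
# computable semimeasures; this only illustrates the mechanism.
random.seed(0)
thetas = [i / 12 for i in range(1, 12)]          # candidate values of P(bit = 0)
weights = [1.0 / len(thetas)] * len(thetas)      # uniform prior weights

TRUE_P0 = 5 / 6                                  # input: 0 w.p. 5/6, 1 w.p. 1/6

for t in range(1, 2001):
    bit = 0 if random.random() < TRUE_P0 else 1
    # Bayesian update: each hypothesis is reweighted by the probability
    # it assigned to the observed bit.
    likelihoods = [(th if bit == 0 else 1 - th) for th in thetas]
    weights = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(weights)
    weights = [w / total for w in weights]
    if t % 500 == 0:
        # Posterior-weighted probability that the *next* bit is 0.
        p0_next = sum(w * th for w, th in zip(weights, thetas))
        print(f"after {t:4d} bits: predicted P(next bit = 0) = {p0_next:.3f}")

Run as-is, the printed prediction settles near 0.833 ≈ 5⁄6, because the posterior weight concentrates on the hypotheses closest to the input bias.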
For posterity, the convention is to call the two models the universal (Solomonoff) prior M and the universal (Levin) mixture ξ, respectively.
Nitpick: the prior does not converge. The prior is what you have before you start observing data; after that it is a posterior.
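In symbols (same notation as above), what observation does is turn the prior weight w_\nu of each hypothesis into a posterior weight

w_\nu \;\longrightarrow\; \frac{w_\nu\, \nu(x_{1:t})}{\xi(x_{1:t})},

and it is this posterior, not the prior itself, that concentrates on hypotheses consistent with the data.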
Many thanks, I get it now.
What matters is the probability that each of them assigns to the next bit being equal to one.
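Concretely, in the notation above, the quantity a consumer of predictions sees is the conditional (semi)measure on the next bit,

M(1 \mid x_{1:t}) = \frac{M(x_{1:t}1)}{M(x_{1:t})}, \qquad \xi(1 \mid x_{1:t}) = \frac{\xi(x_{1:t}1)}{\xi(x_{1:t})},

and Solomonoff's convergence result says both of these approach the true conditional probability with probability 1 whenever the bits are drawn from a computable distribution.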