Perhaps I’m being dim, but it doesn’t look as if you answered my question: what differences do you see between “Solomonoff induction” and “Bayesian inference with a 2^-length prior”?
It sounds (but I may be misunderstanding) as if you’re contrasting “choose the shortest program consistent with the data” with “do Bayesian inference with all programs as hypothesis space and the 2^-length prior” in which case you’re presumably proposing that “Solomonoff induction” should mean the former. But—I have now found at least one of Solomonoff’s articles—it’s the latter, not the former, that Solomonoff proposed. (I quote from “A preliminary report on a general theory of inductive inference”, revised version. From the preface: “The main point of the report is Equation (5), Section 11”. From section 12, entitled “An interpretation of equation (5)”: “Equation (5) then enables us to use this model to obtain a priori probabilities to be used in computation of a posteriori probabilities using Bayes’ Theorem.”)
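(For concreteness, since the equation numbers mean little out of context: the standard modern statement of Solomonoff’s proposal, with U a universal machine, is

    M(x) = \sum_{p \,:\, U(p)\ \text{begins with}\ x} 2^{-|p|}, \qquad P(a \mid x) = M(xa) / M(x),

i.e. a 2^-length prior over all programs, conditioned on the output beginning with the observed data. I am paraphrasing the usual textbook form here, not quoting Equation (5) itself.)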
Solomonoff induction is uncomputable and any computable method necessarily fails to find shortest-possible programs; this is well known. If this fact is the basis for your objection to using the term “Solomonoff induction” to denote Bayes-with-length-prior then I’m afraid I don’t understand why; could you explain more?
It’s conventional to assume a prefix-free representation of programs when discussing this sort of formalism. I don’t see that this has anything to do with the distinction (if there is one) between Solomonoff and Bayes-with-length-prior.
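(As far as I can tell, the prefix-free convention matters only for normalization: Kraft’s inequality guarantees that for any prefix-free set of programs

    \sum_p 2^{-|p|} \le 1,

so the 2^-length weights form a semimeasure rather than a possibly divergent sum. That is a property of the prior itself, orthogonal to any Solomonoff-vs-Bayes distinction.)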
Firstly, on what is conventional: it is sadly the case that, in the LW conversations I have observed, it is not typical for either party to recall how important it is that ALL programs form the hypothesis space, or indeed any other detail, even one as important as ‘the output’s beginning is matched against the observed data’.
I’m sorry; I was thinking in a context where the hypothesis space is not complete. (Note: in the various papers on the subject, if I recall correctly, a ‘hypothesis’ can refer either to the final string, i.e. the prediction, or to the code.) Indeed, when it is applied to the complete set of hypotheses, it works fine. It should still be noted, though, that it is Solomonoff probability (algorithmic probability) that can be used with Bayes, resulting in Solomonoff induction.
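To make that concrete, here is a minimal toy sketch of Bayes-with-2^-length-prior. Everything in it (the bit-repeating ‘machine’, the MAX_LEN bound, the OBSERVED string) is my own illustrative assumption; the real construction runs all programs on a universal prefix machine and is uncomputable.

    # Toy sketch of Bayes with the 2^-length prior.  The bit-repeating
    # "machine" below is an illustrative stand-in for a universal prefix
    # machine, which would make this uncomputable.
    from itertools import product

    MAX_LEN = 8          # bound on program length (the real version is unbounded)
    OBSERVED = "010101"  # data that each program's output must begin with

    def run(program, n):
        """Toy machine: 'executes' a bit string by repeating it, truncated to n bits."""
        return (program * (n // len(program) + 1))[:n]

    # Hypothesis space: ALL programs up to MAX_LEN, not just the shortest.
    programs = ["".join(bits)
                for k in range(1, MAX_LEN + 1)
                for bits in product("01", repeat=k)]

    # Keep every program whose output's beginning matches the observed data,
    # weighting each survivor by the 2^-length prior.
    weights = {p: 2.0 ** -len(p)
               for p in programs
               if run(p, len(OBSERVED)) == OBSERVED}

    # Posterior predictive probability that the next bit is '1'.
    total = sum(weights.values())
    p_one = sum(w for p, w in weights.items()
                if run(p, len(OBSERVED) + 1)[-1] == "1") / total
    print(f"P(next bit = 1 | {OBSERVED}) = {p_one:.3f}")

The shortest surviving program dominates the posterior, but longer consistent programs still contribute, which is exactly what ‘choose the single shortest program’ would throw away.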
It was my mistake as to what exactly Solomonoff specified, given that there is a multitude of definitions, equivalent up to a constant, that all end up called by the same name. By ‘Solomonoff induction’ I was referring to the one described in the article.
OK. (My real purpose in posting this comment is to remark, in case you should care, that it’s not I who have been downvoting you in this thread.)
Thanks for the correction, in any case; this was genuinely an error on my part. Amusingly, this comment sat at +1, while my observation that MWI (which someone else brought up) doesn’t begin with the observed data was treated much more negatively.