Scott Garrabrant comments on Using the universal prior for logical uncertainty (retracted)

Scott Garrabrant Mar 3, 2018, 8:24 PM
5 points
Further, if Solomonoff Induction does get these problems right, it does so because of closure properties on the class of hypotheses, not because of properties of the way in which the hypotheses are combined.
In the Logical Induction framework, if you add a bunch of other uncomputable hypotheses, you will still get the good properties on the predictable sub-patterns of the environment.
If you start with the Solomonoff Induction framework, this is demonstrably not true: If I have an environment which is 1 on even bits and uncomputable on odd bits, I can add an uncomputable hypothesis that knows all the odd bits. It can gain trust over time, then spend that trust to say that the next even bit has a 90% chance to be a 0. It will take a hit from this prediction, but can earn back the trust from the odd bits and repeat.
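A toy sketch of this exploit, under loud assumptions: a two-hypothesis Bayesian mixture stands in for Solomonoff Induction, pseudorandom bits stand in for the uncomputable odd bits, and the names (`simulate`, `sabotage_every`) and the numbers (0.99 confidence, sabotage on every 10th even bit) are illustrative choices, not anything from the Solomonoff or Logical Induction literature.

```python
import math
import random


def simulate(n_bits=2000, sabotage_every=10, seed=0):
    """Return the mixture's P(bit = 1) on each sabotaged even bit."""
    rng = random.Random(seed)
    # Log-weights of [honest, oracle]; start from a uniform prior.
    log_w = [math.log(0.5), math.log(0.5)]
    sabotaged_predictions = []
    evens_seen = 0

    for t in range(n_bits):
        if t % 2 == 0:
            # Even positions: the environment always outputs 1.
            bit = 1
            evens_seen += 1
            sabotage = evens_seen % sabotage_every == 0
            p_honest = 0.99  # honest hypothesis has learned the pattern
            # Oracle usually agrees, but sometimes claims a 90% chance of 0.
            p_oracle = 0.10 if sabotage else 0.99
        else:
            # Odd positions: stand-in for the uncomputable bits.
            bit = rng.randint(0, 1)
            sabotage = False
            p_honest = 0.5  # honest hypothesis can only guess
            p_oracle = 0.99 if bit == 1 else 0.01  # oracle knows the bit

        # Normalize the posterior weights (shifted to avoid overflow).
        top = max(log_w)
        unnorm = [math.exp(lw - top) for lw in log_w]
        total = sum(unnorm)
        w_honest, w_oracle = (u / total for u in unnorm)

        # Mixture probability that the next bit is 1.
        mixture_p1 = w_honest * p_honest + w_oracle * p_oracle
        if sabotage:
            sabotaged_predictions.append(mixture_p1)

        # Bayes update: add each hypothesis's log-likelihood of the bit.
        for i, p1 in enumerate((p_honest, p_oracle)):
            likelihood = p1 if bit == 1 else 1.0 - p1
            log_w[i] += math.log(likelihood)

    return sabotaged_predictions


preds = simulate()
print(len(preds), min(preds), max(preds))
```

In this sketch the oracle gains a likelihood factor of 0.99/0.5 on every odd bit and loses a factor of 0.1/0.99 on each sabotaged even bit, so with ten odd bits between sabotages it always comes out ahead. The mixture's probability of a 1 on the sabotaged even bits therefore stays close to 0.1, even though every even bit is 1.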
- cousin_it Mar 3, 2018, 10:23 PM
  5 points
  Isn’t that a complaint against Bayes, not just Solomonoff? Take any prior P. Take a sequence S whose even bits are all 1 and whose odd bits maximally disagree with P. Take another prior H that exactly knows the odd bits of S but thinks the even bits are random. Now the prior (P+H)/2 will never learn that all even bits of S are 1.
  So I’m not quite ready to buy that complaint, because it seems to me that any good method should be at least approximately Bayesian. But maybe I don’t understand enough about what you’re doing...
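  A worked version of the (P+H)/2 argument, as a sketch. The notation is an assumption introduced here: positions are 1-indexed, x is a prefix of S ending just before an even position, k and m are the numbers of odd and even positions in x (so k ≥ m), and w_H(x) is the posterior weight of H after seeing x.

  ```latex
  \begin{align*}
  H(x) &= 2^{-m}
    && \text{($H$ knows the odd bits, guesses the even ones)} \\
  P(x) &\le 2^{-k} \le 2^{-m}
    && \text{(each odd bit of $S$ gets conditional probability $\le \tfrac12$ under $P$)} \\
  w_H(x) &= \frac{H(x)}{P(x) + H(x)} \ge \frac12 \\
  \Pr\nolimits_{(P+H)/2}(\text{next bit} = 1 \mid x)
    &= \bigl(1 - w_H(x)\bigr)\, P(1 \mid x) + w_H(x) \cdot \tfrac12 \\
    &\le 1 - \tfrac12\, w_H(x) \le \tfrac34
  \end{align*}
  ```

  So under these assumptions the mixture never assigns more than 3/4 to an even bit being 1, however long the sequence runs.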
  - Scott Garrabrant Mar 3, 2018, 10:34 PM
    6 points
    It is a complaint against Bayes, but it is only a complaint against using Bayes in cases where the real world has probability 0 under your prior.
    Part of the point of logical induction is that logic is complicated and no hypothesis in the logical induction algorithm can actually predict it correctly in full, but the algorithm allows for the hypotheses to prove themselves on a sub-pattern, and have the ensemble converge to the correct behavior on that sub-pattern.
    - cousin_it Mar 3, 2018, 11:00 PM
      4 points
      Is there a simple “continuous” description of the class of objects that LI belongs to, which shows the point of departure from Bayes without relying on all details of LI? (For example, “it’s like a prior but the result also depends on ordering of input facts”.)
      - Scott Garrabrant Mar 3, 2018, 11:33 PM
        5 points
        Not really.
        You can generalize LI to arbitrary collections of hypotheses, and interpret it as being about bit sequences rather than logic, but not much more than that.
        The reason the LI paper talks about the LI criterion rather than a specific algorithm is to push in that direction, but it is not as clean as your example.
      - Vanessa Kosoy Mar 4, 2018, 7:00 AM
        1 point
        I’m not sure I understand the question correctly, but what “LI” actually depends on is, more or less, a collection of traders plus a “prior” over them (although you can’t interpret it as an actual prior since more than one trader can be important in understanding a given environment). Plus there is some ambiguity in the process of choosing fixed points (because there might be multiple fixed points).