Further, If Solomonoff Induction does get these problems right, it does so because closure properties on the class of hypotheses, not because of properties of the way in which the hypotheses are combined.
In the Logical Induction framework, if you add a bunch of other uncomputable hypotheses, you will still get the good properties on the predictable sub-patterns of the environment.
If you start with the Solomonoff Induction framework, this is demonstrably not true: If I have an environment which is 1 on even bits and uncomputable on odd bits, I can add an uncomputable hypothesis that knows all the odd bits. It can gain trust over time, then spend that trust to say that the next even bit has a 90% chance to be a 0. It will take a hit from this prediction, but can earn back the trust from the odd bits and repeat.
Isn’t that a complaint against Bayes, not just Solomonoff? Take any prior P. Take a sequence S whose even bits are all 1 and whose odd bits maximally disagree with P. Take another prior H that exactly knows the odd bits of S but thinks the even bits are random. Now the prior (P+H)/2 will never learn that all even bits of S are 1.
So I’m not quite ready to buy that complaint, because it seems to me that any good method should be at least approximately Bayesian. But maybe I don’t understand enough about what you’re doing...
It is a complaint against Bayes, but it is only a complaint against using Bayes in cases where the real world is probability 0 in your prior.
Part of the point of logical induction is that logic is complicated and no hypothesis in the logical induction algorithm can actually predict it correctly in full, but the algorithm allows for the hypotheses to prove themselves on a sub-pattern, and have the ensemble converge to the correct behavior on that sub-pattern.
Is there a simple “continuous” description of the class of objects that LI belongs to, which shows the point of departure from Bayes without relying on all details of LI? (For example, “it’s like a prior but the result also depends on ordering of input facts”.)
You can generalize LI to arbitrary collections of hypotheses, and interpret it as being about bit sequences rather logic, but not much more than that.
The reason the LI paper talks about the LI criterion rather than a specific algorithm is to push in that direction, but it is not as clean as your example.
I’m not sure I understand the question correctly, but what “LI” actually depends on is, more or less, a collection of traders plus a “prior” over them (although you can’t interpret it as an actual prior since more than one trader can be important in understanding a given environment). Plus there is some ambiguity in the process of choosing fixed points (because there might be multiple fixed points).
Further, If Solomonoff Induction does get these problems right, it does so because closure properties on the class of hypotheses, not because of properties of the way in which the hypotheses are combined.
In the Logical Induction framework, if you add a bunch of other uncomputable hypotheses, you will still get the good properties on the predictable sub-patterns of the environment.
If you start with the Solomonoff Induction framework, this is demonstrably not true: If I have an environment which is 1 on even bits and uncomputable on odd bits, I can add an uncomputable hypothesis that knows all the odd bits. It can gain trust over time, then spend that trust to say that the next even bit has a 90% chance to be a 0. It will take a hit from this prediction, but can earn back the trust from the odd bits and repeat.
Isn’t that a complaint against Bayes, not just Solomonoff? Take any prior P. Take a sequence S whose even bits are all 1 and whose odd bits maximally disagree with P. Take another prior H that exactly knows the odd bits of S but thinks the even bits are random. Now the prior (P+H)/2 will never learn that all even bits of S are 1.
So I’m not quite ready to buy that complaint, because it seems to me that any good method should be at least approximately Bayesian. But maybe I don’t understand enough about what you’re doing...
It is a complaint against Bayes, but it is only a complaint against using Bayes in cases where the real world is probability 0 in your prior.
Part of the point of logical induction is that logic is complicated and no hypothesis in the logical induction algorithm can actually predict it correctly in full, but the algorithm allows for the hypotheses to prove themselves on a sub-pattern, and have the ensemble converge to the correct behavior on that sub-pattern.
Is there a simple “continuous” description of the class of objects that LI belongs to, which shows the point of departure from Bayes without relying on all details of LI? (For example, “it’s like a prior but the result also depends on ordering of input facts”.)
Not really.
You can generalize LI to arbitrary collections of hypotheses, and interpret it as being about bit sequences rather logic, but not much more than that.
The reason the LI paper talks about the LI criterion rather than a specific algorithm is to push in that direction, but it is not as clean as your example.
I’m not sure I understand the question correctly, but what “LI” actually depends on is, more or less, a collection of traders plus a “prior” over them (although you can’t interpret it as an actual prior since more than one trader can be important in understanding a given environment). Plus there is some ambiguity in the process of choosing fixed points (because there might be multiple fixed points).