Isn’t that a complaint against Bayes, not just Solomonoff? Take any prior P. Take a sequence S whose even bits are all 1 and whose odd bits maximally disagree with P. Take another prior H that exactly knows the odd bits of S but thinks the even bits are random. Now the prior (P+H)/2 will never learn that all even bits of S are 1.
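To make this concrete, here is a quick simulation with one particular choice of P and H (the specific distributions below are just for illustration; any choice matching the construction should behave the same way):

```python
import math

# Toy version of the construction above (the specific distributions are my
# choice, purely for illustration):
#   P: thinks every bit is 1 with probability 0.9, i.i.d.
#   H: knows each odd-indexed bit exactly, treats even-indexed bits as fair coins.
#   S: even-indexed bits are all 1; odd-indexed bits are 0, the worst case for P.
# The mixture (P + H)/2 is updated by Bayes on the observed prefix of S.

N = 200
S = [1 if i % 2 == 0 else 0 for i in range(N)]

def p_prob(i, bit):
    """P's probability that bit i takes the given value."""
    return 0.9 if bit == 1 else 0.1

def h_prob(i, bit):
    """H's probability: certain about odd-indexed bits, 0.5 on even-indexed ones."""
    if i % 2 == 1:
        return 1.0 if bit == S[i] else 0.0
    return 0.5

log_w_p = log_w_h = math.log(0.5)              # mixture weights, in log space
for i, b in enumerate(S):
    w_p, w_h = math.exp(log_w_p), math.exp(log_w_h)
    # The mixture's predictive probability that bit i is 1, before observing it:
    pred_one = (w_p * p_prob(i, 1) + w_h * h_prob(i, 1)) / (w_p + w_h)
    if i in (0, 2, 20, 198):
        print(f"bit {i}: mixture P(bit = 1) = {pred_one:.3f}")
    # Bayes update on the observed bit:
    log_w_p += math.log(p_prob(i, b))
    log_w_h += math.log(h_prob(i, b))

# The even-bit prediction falls from 0.7 toward 0.5 instead of rising toward 1:
# H's perfect record on the odd bits swallows all the posterior weight, and H
# insists the even bits are random.
```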
So I’m not quite ready to buy that complaint, because it seems to me that any good method should be at least approximately Bayesian. But maybe I don’t understand enough about what you’re doing...
It is a complaint against Bayes, but only a complaint against using Bayes in cases where the real world has probability 0 under your prior.
Part of the point of logical induction is that logic is complicated, and no hypothesis in the logical induction algorithm can actually predict it correctly in full. Instead, the algorithm allows individual hypotheses to prove themselves on a sub-pattern, and lets the ensemble converge to the correct behavior on that sub-pattern.
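As a loose analogy (this is not the actual logical induction algorithm, and the setup below is made up purely to contrast with the Bayesian mixture above): imagine a wealth-weighted betting market in which each trader only bets on the bits it claims to understand. A trader that is right about a sub-pattern gets rich from those bets alone, and the market price on that sub-pattern converges, even though the trader says nothing about the rest of the sequence.

```python
import random

# Loose toy analogy, not the actual LI construction:
#   Trader A only has opinions about even-indexed bits, predicting 1 with
#   probability 0.99, and abstains everywhere else.
#   Trader B has an opinion about every bit, always predicting a fair coin.
#   S has even-indexed bits equal to 1 and odd-indexed bits that are noise
#   A never bets on.

random.seed(0)
N = 200
S = [1 if i % 2 == 0 else random.randint(0, 1) for i in range(N)]

wealth = {"A": 1.0, "B": 1.0}

def opinion(trader, i):
    """Trader's predicted probability that bit i is 1, or None to abstain."""
    if trader == "A":
        return 0.99 if i % 2 == 0 else None    # A only understands even bits
    return 0.5                                 # B treats every bit as a coin flip

for i, b in enumerate(S):
    bettors = {}
    for t in wealth:
        p = opinion(t, i)
        if p is not None:
            bettors[t] = p
    total = sum(wealth[t] for t in bettors)
    price = sum(wealth[t] * bettors[t] for t in bettors) / total  # market P(bit = 1)
    if i in (0, 2, 20, 198):
        print(f"bit {i}: market P(bit = 1) = {price:.3f}")
    for t, p in bettors.items():
        # Pari-mutuel settlement: a bettor's wealth grows by the ratio of its
        # probability for the realized outcome to the market's probability.
        p_out = p if b == 1 else 1 - p
        m_out = price if b == 1 else 1 - price
        wealth[t] *= p_out / m_out

# A's wealth grows from the even-indexed bets alone, so the even-bit price
# climbs toward 0.99, even though A says nothing about the odd bits it cannot
# predict. The odd-bit price stays at 0.5, since only B has an opinion there.
```

Contrast this with the Bayesian mixture above, where a hypothesis's weight is determined by its likelihood on the whole sequence, so being wrong off the sub-pattern is fatal.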
Is there a simple “continuous” description of the class of objects that LI belongs to, which shows the point of departure from Bayes without relying on all details of LI? (For example, “it’s like a prior but the result also depends on ordering of input facts”.)
Not really.
You can generalize LI to arbitrary collections of hypotheses, and interpret it as being about bit sequences rather than logic, but not much more than that.
The reason the LI paper talks about the LI criterion rather than a specific algorithm is to push in that direction, but it is not as clean as your example.
I’m not sure I understand the question correctly, but what “LI” actually depends on is, more or less, a collection of traders plus a “prior” over them (although you can’t interpret it as an actual prior since more than one trader can be important in understanding a given environment). Plus there is some ambiguity in the process of choosing fixed points (because there might be multiple fixed points).
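On the last point, here is a made-up single-trader toy of what the fixed-point ambiguity can look like (nothing below is the actual LI construction): call a price p a fixed point when the trader's demand at p is zero; a trader that is indifferent over a whole interval of prices admits many such fixed points.

```python
# Made-up single-trader toy of the fixed-point ambiguity mentioned above
# (not the actual LI construction): a price p is a "fixed point" when the
# trader's demand at p is zero.

def demand(p):
    """Toy trader: buys below 0.3, sells above 0.7, indifferent in between."""
    if p < 0.3:
        return 1.0    # wants to buy, so p cannot be a clearing price
    if p > 0.7:
        return -1.0   # wants to sell, so p cannot be a clearing price
    return 0.0        # indifferent: any such p clears the market

fixed_points = [p / 100 for p in range(101) if demand(p / 100) == 0.0]
print(f"{len(fixed_points)} clearing prices, from {fixed_points[0]} to {fixed_points[-1]}")
# 41 clearing prices, from 0.3 to 0.7 -- the choice among them is the kind of
# multiplicity the parenthetical is pointing at.
```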