Imagine that we encounter a truly iid random sequence of 90% likely propositions Q(0),Q(1),Q(2),…
Perhaps they are merely pseudorandom but impossibly complicated to reason about, or perhaps they represent some random external output that an agent observes.
After observing a very large number of these Q(i), one might expect to place high probability on something like “About 90% of the next 10^100 Q(j) I haven’t observed yet will be true,” but there is unlikely to be any simple rule that describes the already observed Q(i). Do you think that each of the next 10^100 Q(j) will individually be assigned about 90% probability of being true, or will the simpler-to-describe Q(j) receive something closer to 50%?
We can show that the FOL prior is not too different from the algorithmic prior, so it can’t perform too badly on problems where algorithmic induction does well. Partial theories which imply probabilities close to 0.9 but do not specify exact predictions will eventually have high probability. For example, a theory might specify that Q(x) is the OR of an unspecified F(x) and G(x), each treated as a random source, which makes the probabilities roughly 0.75; variations of this would bring things closer to 0.9.
This may still assign the simpler-to-describe Q(j) a probability closer to 50%.
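The arithmetic behind the OR example can be checked with a short sketch. It assumes that each unspecified source is an independent fair coin (which is what yields the 0.75 figure for two sources); the function name or_probability and the idea of OR-ing more than two sources are illustrative assumptions, just one possible reading of “variations of this.”

```python
from fractions import Fraction

def or_probability(k: int, p: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that the OR of k independent sources is true,
    when each source is true with probability p.

    Assumption (not from the original text): the sources are independent
    and, by default, fair coins. The name is hypothetical.
    """
    # The OR is false only when all k sources are false.
    return 1 - (1 - p) ** k

# OR of 1, 2, 3, 4 fair sources.
for k in range(1, 5):
    print(k, float(or_probability(k)))
```

Under these assumptions two sources give 3/4, three give 7/8, and four give 15/16, so theories of this shape bracket 0.9 without hitting it exactly; a theory would need biased sources or some other variation to get closer.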