Consider an arbitrary probability distribution P, and the smallest integer (or the lexicographically least object) x such that P(x) < 1/3^^^3 (in Knuth’s up-arrow notation). Since x has a short description, a universal distribution shouldn’t assign it such a low probability, but P does, so P can’t be a universal distribution.
Solomonoff Induction would not consider this a description of x as it cannot be used to compute x.
I guess I failed to bridge the inferential gap here. You’re right “Solomonoff Induction would not consider this a description of x as it cannot be used to compute x.” The point I’m trying to make is that a correct formalization of induction/Occam’s Razor ought (in order to satisfy our intuitions) to assign x some probability > 1/3^^^3 since it has a short (even if uncomputable) description, but a formalization based on this P would not, therefore it can’t be a correct formalization. But this is the case no matter what P we use (with different x depending on P), hence the “unformalizability” problem.
Could you explain further? Why “ought” it assign it such a probability? As stated, this seems more convincing as an argument that it “ought not” to assign a probability > 1/3^^^3, despite the short “description”.
The idea is, however we define P, how can we be that sure that there isn’t some kind of uncomputable physics that would allow someone to build a device that can find the lexicographically least object x such that P(x) < 1/3^^^3 and present it us?
Maybe we should just work off of the assumption that there’s no relevant uncomputable physics, because if there were, we should probably give up our endeavors anyways, unless we knew how to model an uncomputable reality within a computable AGI-program. As Schmidhuber ever so aptly wrote on his homepage:
The distribution should at least be computable in the limit. That is, there should exist a program that takes as an input any beginning of the universe history as well as a next possible event, and produces an output converging on the conditional probability of the event. If there were no such program we could not even formally specify our universe, leave alone writing reasonable scientific papers about it.
unless we knew how to model an uncomputable reality within a computable AGI-program
You could start out by trying to understand how an AI might invent the concept of uncomputability, and how it might then proceed to the possibility of uncomputable physics. And one way to get started here is by thinking as a cognitive historian, and asking how humans came up with the concept of uncomputability.
That gives you a probability inversely proportional to the integral of e^-(the description length) of each number from 0 to infinity. Complexity grows like log(n)-ish. It’s an improper prior.
So I agree with Adele that this may be a modus tollens / modus ponens moment.
If there’s some uncomputable physics that would allow someone to build such a device, we ought to redefine what we mean by computable to include whatever the device outputs. After all, said device falsifies the Church-Turing thesis, which forms the basis for our definition of “computable”.
Yes. ETA: I’m also making the stronger point that any formalization of induction we can come up with would have a similar problem. (Or at least the kinds of formalizations covered by my arguments.)
Solomonoff Induction would not consider this a description of x as it cannot be used to compute x.
I guess I failed to bridge the inferential gap here. You’re right “Solomonoff Induction would not consider this a description of x as it cannot be used to compute x.” The point I’m trying to make is that a correct formalization of induction/Occam’s Razor ought (in order to satisfy our intuitions) to assign x some probability > 1/3^^^3 since it has a short (even if uncomputable) description, but a formalization based on this P would not, therefore it can’t be a correct formalization. But this is the case no matter what P we use (with different x depending on P), hence the “unformalizability” problem.
Could you explain further? Why “ought” it assign it such a probability? As stated, this seems more convincing as an argument that it “ought not” to assign a probability > 1/3^^^3, despite the short “description”.
The idea is, however we define P, how can we be that sure that there isn’t some kind of uncomputable physics that would allow someone to build a device that can find the lexicographically least object x such that P(x) < 1/3^^^3 and present it us?
Maybe we should just work off of the assumption that there’s no relevant uncomputable physics, because if there were, we should probably give up our endeavors anyways, unless we knew how to model an uncomputable reality within a computable AGI-program. As Schmidhuber ever so aptly wrote on his homepage:
You could start out by trying to understand how an AI might invent the concept of uncomputability, and how it might then proceed to the possibility of uncomputable physics. And one way to get started here is by thinking as a cognitive historian, and asking how humans came up with the concept of uncomputability.
“Give up” is neither a well-defined strategy, nor one that was shown to be preferable to its alternatives.
That gives you a probability inversely proportional to the integral of e^-(the description length) of each number from 0 to infinity. Complexity grows like log(n)-ish. It’s an improper prior.
So I agree with Adele that this may be a modus tollens / modus ponens moment.
If there’s some uncomputable physics that would allow someone to build such a device, we ought to redefine what we mean by computable to include whatever the device outputs. After all, said device falsifies the Church-Turing thesis, which forms the basis for our definition of “computable”.
Are you essentially saying Solomonoff Induction could be inferior to some other meta-algorithm (or whatever you call it) in uncomputable universes?
Yes. ETA: I’m also making the stronger point that any formalization of induction we can come up with would have a similar problem. (Or at least the kinds of formalizations covered by my arguments.)
Technically not if P is sufficiently long and complex.