Here are two big weaknesses of Bayesian epistemology as I understand it:
1. It cannot handle uncertainty about unproved mathematical truths.
Why do you think that?
2. It does not describe the way any existing intelligence actually operates, or even could operate in principle. (That last clause is the problem of writing an AI.)
Solomonoff induction is uncomputable? So: use a computable approximation.
What is the argument that the approximation you use is good? What I mean is, when you approximate you are making changes. Some possible changes you could make would create massive errors. Others—the type you are aiming for—only create small errors that don’t spread all over. What is your method of creating an approximation of the second type?
Making computable approximations of Solomonoff induction is a challenging field, and it seems inappropriate to try to cram it into a blog comment. Probably the short answer is “by using stochastic testing”.
There’s a large amount of math behind this sort of thing, and frankly, given your other comments I’m not sure that you have enough background. It might help to read up on Bayesian machine learning, which needs to deal with just this sort of issue. Then keep in mind that there are theorems showing that, given some fairly weak conditions to rule out pathological cases, one can approximate any distribution by a computable distribution to arbitrary accuracy. You need to be careful about what metric you are using, but it turns out to be true for a variety of different notions of approximation and different metrics. This is far from my area of expertise, so I’m by no means an expert on it, but my impression is that the theorems are essentially of the same flavor as the ones you would see in a real analysis course about approximating functions with continuous functions or polynomial functions.
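To illustrate the flavor of result being alluded to, here is a sketch of the classical Weierstrass/Bernstein polynomial approximation mentioned as an analogy above (the helper name bernstein is invented for this example; it is not taken from the thread):

```python
import numpy as np
from math import comb

def bernstein(f, n, xs):
    """Degree-n Bernstein polynomial approximation of f, evaluated at points xs in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * xs**k * (1 - xs)**(n - k) for k in range(n + 1))

f = lambda x: abs(x - 0.5)           # continuous, but not smooth at 0.5
xs = np.linspace(0, 1, 1001)
for n in (5, 20, 80):
    err = float(np.max(np.abs(bernstein(f, n, xs) - f(xs))))
    print(n, round(err, 4))          # the worst-case error shrinks as n grows
```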
If you think I’m mistaken, please say so and elaborate.
Solomonoff induction is uncomputable? So: use a computable approximation.
It’s hard for me to believe that you haven’t thought of this, but it’s difficult to “approximate” an uncomputable function. Think of any enormous computable function f(n) you like. Any putative computable “approximation” of the busy beaver function is eventually off by more than a factor of f(n). I bet Solomonoff is similarly impossible to approximate—am I wrong?
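To make the claim precise (this is a standard fact about the busy beaver function, stated here as an aside): writing $\Sigma(n)$ for the busy beaver function, the domination result is

$$\forall f \text{ computable} \;\; \exists N \;\; \forall n > N : \;\; \Sigma(n) > f(n).$$

Since the product of two computable functions is computable, any computable estimate $g(n)$ eventually satisfies $f(n)\,g(n) < \Sigma(n)$; that is, it is off by more than a factor of $f(n)$ for all sufficiently large $n$.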
I am not aware of any particular issues regarding Bayesian epistemology handling uncertainty about unproved mathematical truths. How is that different from other cases where there is uncertainty?
Using a computable approximation of Solomonoff induction is a standard approach. If you don’t have a perfect compressor, you just use the best one you have. It is the same with Solomonoff induction.
I am not aware of any particular issues regarding Bayesian epistemology handling uncertainty about unproved mathematical truths. How is that different from other cases where there is uncertainty?
I outlined the problem with mathematical uncertainty here. The only reason I believe this is an open problem is Yudkowsky’s say-so in his reply.
Using a computable approximation of Solomonoff induction is a standard approach. If you don’t have a perfect compressor, you just use the best one you have. It is the same with Solomonoff induction.
Standard approach to what? I don’t know what a “compressor” or “perfect compressor” is, if those are technical terms.
To me, the question is whether an approximation to Solomonoff induction has approximately the same behavior as Solomonoff induction. I think it can’t, for instance because no approximation of the busy beaver function (even the “best compressor you have”) behaves anything like the busy beaver function. If you think this is a misleading way of looking at it please tell me.
Solomonoff induction can’t handle the Busy Beaver function because Busy Beaver is non-computable. So it isn’t an issue for approximations of Solomonoff (except in so far as they can’t handle it either).
I am not saying that “Solomonoff can’t handle Busy Beaver.” (I’m not even sure I know what you mean.) I am saying that Solomonoff is analogous to Busy Beaver, for instance because they are both noncomputable functions. Busy Beaver is non-approximable in a strong sense, and so I think that Solomonoff might also be non-approximable in an equally strong sense.
Kolmogorov complexity is uncomputable, but for many applications you can usefully approximate it using PKZIP. The same goes for Solomonoff induction, whose prior is based on Kolmogorov complexity.
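A minimal sketch of this idea in Python, using the standard-library zlib module in place of PKZIP (the helper name compression_estimate is invented for the example): the length of the compressed output acts as a crude, easily computed upper-bound proxy for Kolmogorov complexity.

```python
import os
import zlib

def compression_estimate(data: bytes) -> int:
    """Crude proxy for Kolmogorov complexity: the length of the zlib-compressed
    data (ignoring the fixed overhead of the decompressor itself)."""
    return len(zlib.compress(data, 9))

regular = b"ab" * 5000          # highly regular: should compress very well
random_ish = os.urandom(10000)  # incompressible with overwhelming probability
print(compression_estimate(regular))     # small (a few dozen bytes)
print(compression_estimate(random_ish))  # about 10000 (compression adds a little overhead)
```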
I outlined the problem with mathematical uncertainty here.
Don’t agree with the premise there. As to what Yudkowsky is talking about by saying “Logical uncertainty is an open problem”—it beats me. There’s really only uncertainty. Uncertainty about mathematics is much the same as other kinds of uncertainty.
What premise?
The first 5 lines of the post—this bit:
I can prove that to you, unless I made a mistake. Are you saying you can defeat it a priori by telling me a prior that doesn’t have any of those three properties?
Take induction, for example, where the domain of the function P(X) ranges over the possible symbols that might come next in the stream (or, if you prefer, ranges over the hypotheses that predict them).
Then P(X and Y) is typically not a meaningful concept.
The trichotomy is:
1. It is not defined on all X—i.e. P is agnostic about some things.
2. It has P(X) < P(X and Y) for at least one pair X and Y—i.e. P sometimes falls for the conjunction fallacy.
3. It has P(X) = 1 for all mathematically provable statements X—i.e. P is an oracle.
Taken literally, your P falls for 1. For instance it doesn’t have an opinion about whether it will be sunny tomorrow or the Riemann hypothesis is true. If you wish to avoid this problem by encoding the universe as a string of symbols to feed to your induction machine, why wouldn’t you let me encode “X and Y” at the same time?
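For concreteness, here is a sketch (with invented names, not taken from the thread) of the kind of P under discussion: a next-symbol predictor whose domain is just a fixed alphabet, so a query about an arbitrary conjunction such as “sunny tomorrow and the Riemann hypothesis” is not assigned a probability at all. It is simply outside the domain, which is the sense in which such a P “falls for 1”.

```python
from collections import Counter

class NextSymbolPredictor:
    """Laplace-smoothed estimate of P(next symbol) over a fixed alphabet.
    Only next-symbol events are in its domain; conjunctions of arbitrary
    propositions are not events it assigns probabilities to."""

    def __init__(self, alphabet):
        self.alphabet = set(alphabet)
        self.counts = Counter()
        self.total = 0

    def observe(self, symbol):
        self.counts[symbol] += 1
        self.total += 1

    def p(self, symbol):
        if symbol not in self.alphabet:
            raise ValueError(f"{symbol!r} is outside this predictor's domain")
        return (self.counts[symbol] + 1) / (self.total + len(self.alphabet))

pred = NextSymbolPredictor("01")
for s in "0010":
    pred.observe(s)
print(pred.p("0"))                       # well-defined: probability the next symbol is "0"
# pred.p("sunny tomorrow and Riemann")   # raises ValueError: not an event in the domain
```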