I’ve always thought of it like this: it doesn’t rely on the universe being computable, just on the universe having a computable approximation. So if the universe is computable, SI does perfectly; if it’s not, SI does as well as any algorithm could hope to.
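Roughly, and ignoring constants, the bound behind “as well as any algorithm could hope to” is this: for any computable measure $\mu$ with prefix complexity $K(\mu)$, the universal mixture $M$ dominates it, and its cumulative expected prediction error is finite:

$$M(x) \;\ge\; 2^{-K(\mu)}\,\mu(x)\ \ \text{for all finite }x, \qquad \sum_{t=1}^{\infty} \mathbb{E}_{\mu}\!\Big[\big(M(1\mid x_{<t})-\mu(1\mid x_{<t})\big)^{2}\Big] \;\le\; \frac{\ln 2}{2}\,K(\mu).$$

So against any computable environment, SI’s total expected error is bounded by (on the order of) that environment’s description length in bits.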
Yeah, I think that’s also a correct way of looking at it. However, I also think “hypotheses as reasoning methods” is a bit more intuitive.
When trying to predict what someone will say, it is hard to think “okay, what are the simplest models of the entire universe that have had decent predictive performance so far, and what do they predict now?”. Easier is “okay, what are the simplest ways to make predictions that have had decent predictive performance so far, and what do they predict now?”. (One such way to reason is with a model of the entire universe, so we don’t lose any generality this way.)
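A toy sketch of that picture (not Solomonoff induction itself; the predictors, their description lengths, and the mixture function below are all made up for the example): weight each candidate way of making predictions by a simplicity prior times its track record, then mix.

```python
# Toy illustration of "hypotheses as reasoning methods" (not actual Solomonoff
# induction): mix candidate predictors, weighting each by a simplicity prior
# 2**(-description_length) times its likelihood on the data seen so far.

def mixture_predict(predictors, history):
    """predictors: list of (description_length_bits, predict_fn) pairs, where
    predict_fn(history) returns P(next bit = 1).  history: list of 0/1 bits."""
    weights = []
    for length_bits, predict_fn in predictors:
        w = 2.0 ** (-length_bits)                    # simplicity prior
        for t, bit in enumerate(history):
            p1 = predict_fn(history[:t])             # its prediction at time t
            w *= p1 if bit == 1 else (1.0 - p1)      # likelihood of what happened
        weights.append(w)
    total = sum(weights)
    return sum((w / total) * predict_fn(history)
               for w, (_, predict_fn) in zip(weights, predictors))

# Example: a method that always expects 1s vs. a fair-coin method.
always_one = (3, lambda h: 0.99)   # made-up description lengths, in bits
fair_coin  = (2, lambda h: 0.5)
print(mixture_predict([always_one, fair_coin], [1, 1, 1, 1, 1, 1]))
```

Actual SI does the same thing over all computable hypotheses with exact 2^(-program length) weights, rather than over a hand-picked list.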
For example, if someone else is predicting things better than me, I should try to understand why. And you can vaguely understand this process in terms of Solomonoff induction. For example, it gives you a precise way to reason about whether you should copy the reasoning of people who win the lottery.
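A rough way to cash out the lottery case (a sketch, not something from the original post): calling a 1-in-$10^7$ draw gains a reasoning method about 23 bits of likelihood, but a method that merely hardcodes the winning numbers pays roughly the same 23 bits in description length, so its posterior weight barely moves relative to an ordinary method that didn’t predict the draw:

$$\underbrace{\log_2\frac{P(\text{draw}\mid\text{hardcoded numbers})}{P(\text{draw}\mid\text{guessing})}}_{\approx\,\log_2 10^{7}\,\approx\,23.3\ \text{bits gained}} \quad\text{vs.}\quad \underbrace{K(\text{hardcoded numbers})-K(\text{guessing})}_{\approx\,23.3\ \text{bits of extra complexity}}$$

Copying the winner only makes sense if their method predicts well without paying for each success bit-for-bit in complexity.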
Paul Christiano speculated that the universal prior is in fact mostly just intelligences doing reasoning. Making an intelligence is simple, after all: set up a simple cellular automaton that tends to develop lifeforms, wait 3^^^^3 years, and then look around. (See What does the universal prior actually look like? or the exposition at The Solomonoff Prior is Malign.)
Yeah, I think that’s also a correct way of looking at it. However, I also think “hypotheses as reasoning methods” is a bit more intuitive
SIs don’t engage in a wide variety of types of reasoning; it’s all variations on the same theme.
SI is limited compared to humans. It can’t include itself in a model, it can’t contemplate a non-Turing-computable world, and in many ways it’s limited to instrumentalism, to predicting the next observation. A human can state “suppose the world is non computable”, but how can that be expressed as a programme? Humans, despite being finite, can do all those things. An SI can test an infinite number of (instrumental) hypotheses, but they are all of the same type. It’s important not to confuse “infinite” with “every”: the set of multiples of 23 is infinite, but it does not contain every number.
And you can vaguely understand this process in terms of Solomonoff induction
SI isn’t theoretically useful as a way of understanding human thought. Humans can’t brute-force search every possible hypothesis, and must be doing something more sophisticated instead to come up with good hypotheses.
A human can state “suppose the world is non computable”, but how can that be expressed as a programme?

The same way a human can? GPT-4 can state “suppose the world is non computable”, for example.

But we are talking about SI. An SI isn’t making English statements. What is true of a GPT is not necessarily true of an SI.
The instructions in a programme executed by an SI have semantics related to programme operations, but not to the outside world, because that is all any machine code has. Machine code instructions do things like “add 1 to register A”. You would have to look at thousands or millions of such low-level instructions to infer what kind of high-level maths (vector spaces, or non-Euclidean geometry) the programme is executing.
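A small illustration of that point (a hypothetical mini instruction set, nothing to do with any particular SI hypothesis): each step below only has machine-level meaning, and the fact that the run as a whole computes a dot product is invisible in any single instruction.

```python
# Hypothetical three-instruction machine: each step has purely "machine"
# semantics (load a constant, add registers, multiply registers).

def run(program, registers):
    for op, *args in program:
        if op == "LOAD":                      # LOAD reg, value
            registers[args[0]] = args[1]
        elif op == "ADD":                     # ADD dst, src
            registers[args[0]] += registers[args[1]]
        elif op == "MUL":                     # MUL dst, src
            registers[args[0]] *= registers[args[1]]
    return registers

# These low-level steps happen to compute the dot product (1, 2) . (3, 4) = 11,
# but nothing in any single instruction says "vector" or "geometry".
program = [
    ("LOAD", "A", 1), ("LOAD", "B", 3), ("MUL", "A", "B"),   # A = 1 * 3
    ("LOAD", "C", 2), ("LOAD", "D", 4), ("MUL", "C", "D"),   # C = 2 * 4
    ("ADD", "A", "C"),                                       # A = 3 + 8
]
print(run(program, {})["A"])   # 11
```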
And it’s hard to see how you would know with certainty that SI is describing an uncomputable or random universe. If it is using limited-precision floating-point calculations, is that an approximate representation of unlimited-precision real-number calculations taking place in the territory? Or should it be taken literally? If it uses pseudo-random number generation, does it believe that there is real indeterminism in the territory? Human scientists are also limited in the kind of maths they can use, but again, they can communicate verbally what it is supposed to mean, how exact it is, and so on.
I’ve always thought of it like this: it doesn’t rely on the universe being computable, just on the universe having a computable approximation. So if the universe is computable, SI does perfectly; if it’s not, SI does as well as any algorithm could hope to.
Not really. It’s superior to all algorithms running on a Turing machine, and it is actually superior to an algorithm running on an accelerating Turing machine, because it has access to the complement of the recursively enumerable sets, since it’s a first-level halting oracle. That is very nice, but compared to the most powerful computers/reasoning engines known to mathematics, it’s way less powerful. It’s optimal in the domain of computable universes, and with the resources available to a Solomonoff inductor, it can create a halting oracle, which lets it predict first-level uncomputable sequences like Chaitin’s constant, but not anything more, which is a major limitation compared to the champion/currently optimal machines for reasoning.
So depending on how much compute we give the human in the form of intuition, they could very easily beat Solomonoff induction.
which lets it predict first-level uncomputable sequences like Chaitin’s constant
Do you have a proof/source for this? I haven’t heard it before.
I know in particular that it assigns a probability of 0 to Chaitin’s constant (because all the hypotheses are computable). Are you saying it can predict the prefixes of Chaitin’s constant better than random? I haven’t heard this claim either.
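For reference, Chaitin’s constant for a prefix-free universal machine $U$ is the halting probability

$$\Omega \;=\; \sum_{p\,:\,U(p)\ \text{halts}} 2^{-|p|},$$

which is algorithmically random, so no computable process outputs its bit sequence.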
Hm, I might have confused what you could do if you had enough compute to run Solomonoff induction with what Solomonoff induction itself does, so that’s maybe the issue I had here.
If I wanted to make my own argument for why Solomonoff induction could do it, it’s that it’s essentially a halting oracle, which allows it to compute/create all the digits of Chaitin’s constant, given that it can in general compute the recursively enumerable sets. And since it can compute all the digits of Chaitin’s constant, it can basically read them off like a book, and thus it has predicted the sequence of digits.
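A sketch of the construction being gestured at here (assuming a hypothetical `halts` oracle for a fixed prefix-free universal machine; neither the oracle nor the placeholder searcher program can actually be implemented): “Ω exceeds the rational q” is a Σ₁ question, so a halting oracle can answer it, and binary search then yields any number of bits of Ω.

```python
from fractions import Fraction

def halts(program_source: str) -> bool:
    """Hypothetical halting oracle for the universal prefix machine.
    No such function can actually be written; it stands in for the oracle."""
    raise NotImplementedError

# Placeholder source for the (ordinary, computable) searcher that dovetails
# all programs of the universal prefix machine, accumulating 2**-len(p) for
# each one found to halt, and itself halts once the total exceeds {threshold}.
DOVETAIL_SEARCHER_SOURCE = "<dovetail until lower approximation of Omega > {threshold}>"

def omega_exceeds(q: Fraction) -> bool:
    # "Omega > q" holds iff the searcher halts: a Sigma_1 question,
    # which the halting oracle can decide.
    return halts(DOVETAIL_SEARCHER_SOURCE.format(threshold=q))

def omega_bits(n: int) -> list[int]:
    """First n binary digits of Omega via binary search against the oracle.
    Omega is irrational, so strict comparisons with dyadic rationals suffice."""
    lo, hi = Fraction(0), Fraction(1)
    bits = []
    for _ in range(n):
        mid = (lo + hi) / 2
        if omega_exceeds(mid):
            bits.append(1)
            lo = mid
        else:
            bits.append(0)
            hi = mid
    return bits
```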
Solomonoff induction is a specific probability distribution. It isn’t making “decisions” per se. It can’t notice that its existence implies that there is a halting oracle, and that it could therefore predict one. This is because, in general, Solomonoff induction is not embedded.
If there were a physical process for a halting oracle, that would be pretty sick, because then we could just run Solomonoff induction. As shown in my post, we don’t need to worry that there might be an even better strategy in such a universe; the hypotheses of Solomonoff induction can take advantage of the halting oracle just as well as we can!
As shown in my post, we don’t need to worry that there might be an even better strategy in such a universe; the hypotheses of Solomonoff induction can take advantage of the halting oracle just as well as we can!
You do mention that the methods of reasoning have to be computable for this to work, and there I’m quite a bit more skeptical of that condition holding.