Hooray! I was hoping we’d be able to find some common ground.
My concerns with AIXI and AIXItl are largely of the form “this model doesn’t quite capture what I’m looking for” and “AGI does not reduce to building better and better approximations of AIXI”. These seem like fairly weak and obvious claims to me, so I’m glad they are not contested.
(From some of the previous discussion, I was not sure that we agreed here.)
My objection was to the claim that the Solomonoff updating rule is incapable of learning these facts.
Cool. I make no such claim here.
Since you seem really interested in talking about Solomonoff induction and its ability to deal with these situations, I’ll say a few things that I think we can both agree upon. Correct me if I’m wrong:
A Solomonoff inductor outside of a computable universe, with a search space including Turing machines large enough to compute this universe, and with access to sufficient information, will in the limit construct a perfect model of the universe.
An AI sitting outside of a computable universe interacting only via I/O channels and using Solomonoff induction as above will in the limit have a perfect model of the universe, and will thus be able to act optimally.
A Solomonoff inductor inside its universe cannot consider hypotheses that actually perfectly describe the universe.
Agents inside the universe using Solomonoff induction thus lack the strict optimality-in-the-limit that an extra-universal AIXI would possess.
Does that mean they are dumb? No, of course not. Nothing that you can do inside the universe is going to give you the optimality principle of an AIXI that is actually sitting outside the universe using a hypercomputer. You can’t get a perfect model of the universe from inside the universe, and it’s unreasonable to expect that you should.
While doing Solomonoff induction inside the universe can never give you a perfect model, it can indeed get you a good computable approximation (one of the best computable approximations around, in fact).
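To pin down the “optimality in the limit” I have in mind here: if the observations are actually generated by some computable measure μ over binary sequences, then (this is Solomonoff’s classic convergence bound, quoted from memory, so take the exact constant with a grain of salt) the universal mixture M makes only a bounded sum of expected squared prediction errors:

$$\sum_{t=1}^{\infty} \mathbb{E}_{\mu}\!\left[\big(M(x_t = 1 \mid x_{<t}) - \mu(x_t = 1 \mid x_{<t})\big)^{2}\right] \;\le\; \frac{\ln 2}{2}\, K(\mu)$$

so M’s predictions converge to the true conditional probabilities with μ-probability 1. That’s the sense in which the first two claims above hold; the next two just note that an agent inside the universe can’t run this mixture over a hypothesis class that contains itself.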
(I assume we agree so far?)
The thing is, when we’re inside a universe and we can’t have that optimality principle, I already know how to build the best universe model that I can: I just do Bayesian updates using all my evidence. I don’t need new intractable methods for building good environment models, because I already have one. The problem, of course, is that to be a perfect Bayesian, I need a good prior.
And in fact, Solomonoff induction is just Bayesian updating with a Kolmogorov prior. So of course it will give you good results. As I stated here, I don’t view my concerns with Solomonoff induction as an “induction problem” but rather as a “priors problem”: Solomonoff induction works very well (and, indeed, is basically just Bayesian updating), but the question is, did it pick the right prior?
Maybe Kolmogorov complexity priors will turn out to be the correct answer, but I’m far from convinced (for a number of reasons that go beyond the scope of this discussion). Regardless, though, Solomonoff induction surely gives you the best model you can get given the prior.
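To make “Bayesian updating with a Kolmogorov prior” concrete (modulo the usual details about monotone machines and semimeasures, so treat this as a sketch rather than the official definition): the predictor is the mixture

$$M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}, \qquad M(x_{t+1} \mid x_{1:t}) \;=\; \frac{M(x_{1:t}\, x_{t+1})}{M(x_{1:t})}$$

where U(p) = x* means that program p makes the universal machine U output a string beginning with x. The update rule is ordinary conditioning; the only substantive choice is the 2^(−ℓ(p)) weighting over programs, which is exactly why I call this a “priors problem” rather than an “induction problem”.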
(In fact, the argument with AlexMennen is about whether AIXItl’s prior, which says it is absolutely impossible that the universe is a Turing machine with length > l, is bad enough to hurt AIXItl. I won’t hop into that argument today, but instead I will note that this line of argument does not seem like a good way to approach the question of which prior we should use.)
I’m not trying to condemn Solomonoff induction in this post. I’m trying to illustrate the fact that even if you could build an AIXItl, it wouldn’t be an ideal agent.
There’s one obvious way to embed AIXItl into its environment (hook its output register to its motor output channel) that prevents it from self-modifying, which results in failures. There’s another way to embed it (hook its output channel to its program registers) that requires you to do a lot more work before the variant becomes useful.
Is it possible to make an AIXItl variant useful in the latter case? Sure, probably, but this seems like a pretty backwards way to go about studying self-modification when we could just use a toy model that was designed to study this problem in the first place.
As an aside, I’m betting that we disagree less than you think. I spent some time carefully laying out my concerns in this post, and alluding to other concerns that I didn’t have time to cover (e.g., that the Legg-Hutter intelligence metric fails to capture some aspects of intelligence that I find important), in an attempt to make my position very clear. From your own response, it sounds like you largely agree with my concerns.
And yet, you still put very different words about very different concerns into my mouth when arguing with other people against positions that you believed I held.
I find this somewhat frustrating, and while I’m sure it was an honest mistake, I hope that you will be a bit more careful in the future.
From your own response, it sounds like you largely agree with my concerns.
Yup.
I’m not trying to condemn Solomonoff induction in this post… And yet, you still put very different words about very different concerns into my mouth when arguing with other people against positions that you believed I held.
When I was talking about positions I believe you have held (and may currently still hold?), I was referring to your words in a previous post:
the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.
I find this somewhat frustrating, and while I’m sure it was an honest mistake, I hope that you will be a bit more careful in the future.
I appreciate the way you’ve stated this concern. Comity!
Yeah, I stand by that quote. And yet, when I made my concerns more explicit:
Intuitively, this limitation could be addressed by hooking up the AIXItl’s output channel to its source code. Unfortunately, if you do that, the resulting formalism is no longer AIXItl.
This is not just a technical quibble: We can say many useful things about AIXI, such as “the more input it gets the more accurate its environment model becomes”. On the other hand, we can’t say much at all about an agent that chooses its new source code: we can’t even be sure whether the new agent will still have an environment model!
It may be possible to give an AIXItl variant access to its program registers and then train it such that it acts like an AIXItl most of the time, but such that it can also learn to win the Precommitment game. However, it’s not immediately obvious to us how to do this, or even whether it can be done. This is a possibility that we’d be interested in studying further.
You said you agree, but then still made claims like this:
Either (1) I am wrong and stupid; or (2) they are wrong and stupid.
It sounds like you retained an incorrect interpretation of my words, even after I tried to make them clear in the above post and previous comment. If you still feel that the intended interpretation is unclear, please let me know and I’ll clarify further.
Intuitively, this limitation could be addressed by hooking up the AIXItl’s output channel to its source code.
The text you’ve quoted in the parent doesn’t seem to have anything to do with my point. I’m talking about plain vanilla AIXI/AIXItl. I’ve got nothing to say about self-modifying agents.
Let’s take a particular example you gave:
such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.
Let’s consider an AIXI with a Solomonoff induction unit that’s already been trained to understand physics to the level that we understand it in an outside-the-universe way. It starts receiving bits and rapidly (or maybe slowly, depends on the reward stream, who cares) learns that its input stream is consistent with EM radiation bouncing off of nearby objects. Conveniently, there is a mirror nearby…
Solomonoff induction will generate confabulations about the Solomonoff induction unit of the agent, but all the other parts of the agent run on computable physics, e.g., the CCD camera that generates the input stream, the actuators that mediate the effect of the output voltage. Time to hack the input registers to max out the reward stream!
Plain vanilla AIXI/AIXItl doesn’t have a reward register. It has a reward channel. (It doesn’t save its rewards anywhere, it only acts to maximize the amount of reward signal on the input channel.)
I agree that a vanilla AIXI would abuse EM radiation to flip bits on its physical input channel to get higher rewards.
AIXItl might be able to realize that the contents of its RAM correlate with computations done by its Solomonoff inductor, but it won’t believe that changing the RAM will change the results of induction, and it wouldn’t pay a penny to prevent a cosmic ray from interfering with the inductor’s code.
From AIXI’s perspective, the code may be following along with the induction, but it isn’t actually doing the induction, and (AIXI thinks) wiping the code isn’t a big deal, because (AIXI thinks) it is a given that AIXI will act like AIXI in the future.
Now you could protest that AIXI will eventually learn to stop letting cosmic rays flip its bits because (by some miraculous coincidence) all such bit-flips result in lower expected rewards, and so it will learn to prevent them even while believing that the RAM doesn’t implement the induction.
And when I point out that this isn’t the case in all situations, you can call foul on games where it isn’t the case.
But both of these objections are silly; it should be obvious that an AIXI in such a situation is non-optimal, and I’m still having trouble understanding why you think that AIXI is optimal under violations of ergodicity.
And then I quote V_V, which is how you know that this conversation is getting really surreal:
Then I don’t think we actually disagree.
I mean, it was well known that the AIXI proof of optimality required ergodicity, since Hutter’s original paper.
Plain vanilla AIXI/AIXItl doesn’t have a reward register.
Yeah, I changed that while your reply was in progress.
More to come later...
ETA: Later is now!
I’m still having trouble understanding why you think that AIXI is optimal under violations of ergodicity.
I don’t think that AIXI is optimal under violations of ergodicity; I’m not talking about the optimality of AIXI at all. I’m talking about whether or not the Solomonoff induction part is capable of prompting AIXI to preserve itself.
I’m going to try to taboo “AIXI believes” and “AIXI thinks”. In hypothetical reality, the physically instantiated AIXI agent is a motherboard with sensors and actuators that are connected to the input and output pins, respectively, of a box labelled “Solomonoff Magic”. This agent is in a room. Somewhere in the space of all possible programs there are two programs. The first is just the maximally compressed version of the second, i.e., the first and the second give the same outputs on all possible inputs. The second one is written in Java, with a front-end interpreter that translates the Java program into the native language of the Solomonoff unit. (Java plus a prefix-free coding, blar blar blar). This program contains a human-readable physics simulation and an observation prediction routine. The initial conditions of the physics simulation match hypothetical reality except that the innards of the CPU are replaced by a computable approximation, including things like waste heat and whatnot. The simulation uses the input to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages… ah! ah! ah! Found the Cartesian boundary! No matter how faithful the physics simulation is, AIXI only ever asks for one time-step at a time, so although the simulation’s state propagates to simulated-AIXI’s input voltages, it doesn’t propagate all the way through to the output voltage.
Thank you for your patience, Nate. The outside view wins again.
Can you please expand?
The simulation uses the input to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages… ah! ah! ah! Found the Cartesian boundary! No matter how faithful the physics simulation is, AIXI only ever asks for one time-step at a time, so although the simulation’s state propagates to simulated-AIXI’s input voltages, it doesn’t propagate all the way through to the output voltage.
Actually, I find myself in a state of uncertainty as a result of doing a close reading of section 2.6 of the Gentle Introduction to AIXI in light of your comment here. You quoted Paul Christiano as saying
Recall the definition of AIXI: A will try to infer a simple program which takes A’s outputs as input and provides A’s inputs as output, and then choose utility maximizing actions with respect to that program.
EY, Nate, Rob, and various commenters here (including myself until recently) all seemed to take this as given. For instance, above I wrote:
The simulation uses the input [i.e., action choice fed in as required by expectimax] to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages [emphasis added]
On this “program-that-takes-action-choice-as-an-input” view (perhaps inspired by a picture like that on page 7 of the Gentle Introduction and surrounding text), a simulated event like, say, a laser cutter slicing AIXI’s (sim-)physical instantiation in half, could sever the (sim-)causal connection from (sim-)AIXI’s input wire to its output wire, and this event would not change the fact that the simulation specifies the voltage on the output wire from the expectimax action choice.
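For reference, the expectimax expression I have been trying to square with section 2.6 is, as best I can transcribe it from memory (so the details may be slightly off):

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[\, r_k + \cdots + r_m \,\big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

Here the environment programs q take the whole action sequence a_1 … a_m as an argument and return the observation-reward sequence, which is the formal version of the “program-that-takes-action-choice-as-an-input” reading.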
Your claim, if I understand you correctly, is that the AIXI formalism does not actually express this kind of back-and-forth state swapping. Rather, for any given universe-modeling program, it simulates forward from the specification of the (sim-)input wire voltage (or does something computationally equivalent), not from a specification of the (sim-)output wire voltage. There is some universe-model which simulates a computable approximation of all of (sim-)AIXI’s physical state changes; once the end state has been specified, real-AIXI gives zero weight to all branches of the expectimax tree that do not have an action that matches the state of (sim-)AIXI’s output wire.
Do I have that about right?