Intuitively, this limitation could be addressed by hooking up the AIXItl’s output channel to its source code.
The text you’ve quoted in the parent doesn’t seem to have anything to do with my point. I’m talking about plain vanilla AIXI/AIXItl. I’ve got nothing to say about self-modifying agents.
Let’s take a particular example you gave:
such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.
Let’s consider an AIXI with a Solomonoff induction unit that’s already been trained to understand physics to the level that we understand it in an outside-the-universe way. It starts receiving bits and rapidly (or maybe slowly, depends on the reward stream, who cares) learns that its input stream is consistent with EM radiation bouncing off of nearby objects. Conveniently, there is a mirror nearby…
Solomonoff induction will generate confabulations about the Solomonoff induction unit of the agent, but all the other parts of the agent run on computable physics, e.g., the CCD camera that generates the input stream, the actuators that mediate the effect of the output voltage. Time to hack the input registers to max out the reward stream!
Plain vanilla AIXI/AIXItl doesn’t have a reward register. It has a reward channel. (It doesn’t save its rewards anywhere, it only acts to maximize the amount of reward signal on the input channel.)
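(For reference, Hutter’s expectimax expression for AIXI is roughly the following, with $\ell(q)$ the length of environment program $q$. The rewards $r_i$ only ever appear inside the percepts $o_i r_i$ that the candidate environment programs output, so nothing is stored anywhere in the agent:)

$$a_k \;:=\; \arg\max_{a_k}\sum_{o_k r_k}\cdots\max_{a_m}\sum_{o_m r_m}\bigl(r_k+\cdots+r_m\bigr)\sum_{q\,:\,U(q,\,a_1\ldots a_m)\,=\,o_1 r_1\ldots o_m r_m}2^{-\ell(q)}$$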
I agree that a vanilla AIXI would abuse EM radiation to flip bits on its physical input channel to get higher rewards.
AIXItl might be able to realize that the contents of its RAM correlate with computations done by its Solomonoff inductor, but it won’t believe that changing the RAM will change the results of induction, and it wouldn’t pay a penny to prevent a cosmic ray from interfering with the inductor’s code.
From AIXI’s perspective, the code may be following along with the induction, but it isn’t actually doing the induction, and (AIXI thinks) wiping the code isn’t a big deal, because (AIXI thinks) it is a given that AIXI will act like AIXI in the future.
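The structural reason is visible even in a toy version of the planner. Here is a minimal finite-horizon expectimax sketch (hypothetical interface, not Hutter’s actual definition): whatever the hypothesized environment does to the modeled hardware, the recursion takes a fresh max over the agent’s own next action at every future node, so “AIXI acts like AIXI in the future” is an assumption built into the plan, not a conclusion drawn from the world-model.

```python
# Minimal finite-horizon expectimax in the spirit of AIXI's planner (a toy
# sketch with a hypothetical interface, not Hutter's actual definition).
# `xi(history)` stands in for the mixture over environment programs: it maps
# an action-ended history to a dict of {(observation, reward): probability}.

def expectimax_value(xi, history, horizon, actions):
    if horizon == 0:
        return 0.0
    best = float("-inf")
    for a in actions:
        value = 0.0
        for (obs, reward), p in xi(history + [a]).items():
            # Recurse: the *next* action is again chosen by a max, regardless
            # of what the modeled environment just did to the modeled
            # hardware -- future AIXI-like behavior is taken as a given.
            value += p * (reward + expectimax_value(
                xi, history + [a, (obs, reward)], horizon - 1, actions))
        best = max(best, value)
    return best
```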
Now you could protest that AIXI will eventually learn to stop letting cosmic rays flip its bits because (by some miraculous coincidence) all such bit-flips result in lower expected rewards, and so it will learn to prevent them even while believing that the RAM doesn’t implement the induction.
And when I point out that this isn’t the case in all situations, you can call foul on games where it isn’t the case.
But both of these objections are silly; it should be obvious that an AIXI in such a situation is non-optimal, and I’m still having trouble understanding why you think that AIXI is optimal under violations of ergodicity.
And then I quote V_V, which is how you know that this conversation is getting really surreal:
Then I don’t think we actually disagree.
I mean, it was well known that the AIXI proof of optimality required ergodicity, ever since Hutter’s original paper.
Plain vanilla AIXI/AIXItl doesn’t have a reward register.
Yeah, I changed that while your reply was in progress.
More to come later...
ETA: Later is now!
I’m still having trouble understanding why you think that AIXI is optimal under violations of ergodicity.
I don’t think that AIXI is optimal under violations of ergodicity; I’m not talking about the optimality of AIXI at all. I’m talking about whether or not the Solomonoff induction part is capable of prompting AIXI to preserve itself.
I’m going to try to taboo “AIXI believes” and “AIXI thinks”. In hypothetical reality, the physically instantiated AIXI agent is a motherboard with sensors and actuators that are connected to the input and output pins, respectively, of a box labelled “Solomonoff Magic”. This agent is in a room.

Somewhere in the space of all possible programs there are two programs. The first is just the maximally compressed version of the second, i.e., the first and the second give the same outputs on all possible inputs. The second one is written in Java, with a front-end interpreter that translates the Java program into the native language of the Solomonoff unit (Java plus a prefix-free coding, blah blah blah). This program contains a human-readable physics simulation and an observation prediction routine. The initial conditions of the physics simulation match hypothetical reality, except that the innards of the CPU are replaced by a computable approximation, including things like waste heat and whatnot.

The simulation uses the input to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages… ah! ah! ah! Found the Cartesian boundary! No matter how faithful the physics simulation is, AIXI only ever asks for one time-step at a time, so although the simulation’s state propagates to simulated-AIXI’s input voltages, it doesn’t propagate all the way through to the output voltage.
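In code, the setup I’m describing looks something like the following toy sketch (hypothetical names, trivial stand-in physics): each timestep the environment program clamps the simulated output-wire voltage to whatever action is handed in from outside, propagates its simulated physics forward, and reports back only the simulated input-wire voltage.

```python
# Toy rendering of the hypothesis-program interface described above
# (hypothetical names, trivial stand-in physics).  The Cartesian boundary is
# the `action` argument: the simulated output-wire voltage is always clamped
# from outside, so the simulation's state never determines the next output.

def simulate_physics_one_step(state, output_voltage):
    # Stand-in for the human-readable physics simulation: the "room" just
    # reflects the clamped output voltage back toward the sensors with a
    # one-step delay.
    new_state = dict(state)
    new_state["output_wire"] = output_voltage          # clamped from outside
    new_state["input_wire"] = state.get("room", 0.0)   # physics -> input wire
    new_state["room"] = output_voltage                 # the action hits the room
    return new_state

class EnvProgram:
    def __init__(self, initial_state):
        self.state = dict(initial_state)

    def step(self, action):
        self.state = simulate_physics_one_step(self.state, output_voltage=action)
        # Only the simulated input-wire voltage comes back as the percept;
        # the next output voltage will again be supplied externally.
        return self.state["input_wire"]
```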
Thank you for your patience, Nate. The outside view wins again.
The simulation uses the input to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages… ah! ah! ah! Found the Cartesian boundary! No matter how faithful the physics simulation is, AIXI only ever asks for one time-step at a time, so although the simulation’s state propagates to simulated-AIXI’s input voltages, it doesn’t propagate all the way through to the output voltage.
Actually, I find myself in a state of uncertainty as a result of doing a close reading of section 2.6 of the Gentle Introduction to AIXI in light of your comment here. You quoted Paul Christiano as saying:
Recall the definition of AIXI: A will try to infer a simple program which takes A’s outputs as input and provides A’s inputs as output, and then choose utility maximizing actions with respect to that program.
EY, Nate, Rob, and various commenters here (including myself until recently) all seemed to take this as given. For instance, above I wrote:
The simulation uses the input [i.e., action choice fed in as required by expectimax] to determine the part of the initial conditions that specifies simulated-AIXI’s output voltages [emphasis added]
On this “program-that-takes-action-choice-as-an-input” view (perhaps inspired by a picture like that on page 7 of the Gentle Introduction and surrounding text), a simulated event like, say, a laser cutter slicing AIXI’s (sim-)physical instantiation in half, could sever the (sim-)causal connection from (sim-)AIXI’s input wire to its output wire, and this event would not change the fact that the simulation specifies the voltage on the output wire from the expectimax action choice.
Your claim, if I understand you correctly, is that the AIXI formalism does not actually express this kind of back-and-forth state swapping. Rather, for any given universe-modeling program, it simulates forward from the specification of the (sim-)input wire voltage (or does something computationally equivalent), not from a specification of the (sim-)output wire voltage. There is some universe-model which simulates a computable approximation of all of (sim-)AIXI’s physical state changes; once the end state has been specified, real-AIXI gives zero weight to all branches of the expectimax tree that do not have an action that matches the state of (sim-)AIXI’s output wire.
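Concretely, the picture I now have in mind is something like the following toy sketch (my gloss, hypothetical names, not anything lifted from the papers): each universe-modeling program simulates the whole machine forward on its own, and the hypothesized action is used only as a consistency filter on the expectimax branches.

```python
# Toy rendering of the "forward-simulate everything" reading (my gloss,
# hypothetical names).  Each program runs the whole machine forward without
# being fed an action; a branch of the expectimax tree keeps a program's
# weight only if the program's simulated output-wire state matches the
# action hypothesized on that branch.

def branch_weight(programs, history, hypothesized_action):
    # `programs` is a list of (prior_weight, program) pairs, where each
    # `program` maps a history to a dict describing the full simulated
    # machine state, including the predicted (sim-)output-wire voltage.
    total = 0.0
    for prior, program in programs:
        sim_state = program(history)             # forward simulation only
        if sim_state["output_wire"] == hypothesized_action:
            total += prior                       # consistent: weight survives
        # otherwise this program contributes zero weight to this branch
    return total
```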
Do I have that about right?