Intuitively, this limitation could be addressed by hooking up the AIXItl’s output channel to its source code. Unfortunately, if you do that, the resulting formalism is no longer AIXItl.
It sounds to me like, although such an agent would very quickly self-modify into something other than AIXItl, it would be AIXItl at least on the first timestep (despite its incorrect assumption that its output does not change its source code).
I expect that such an agent would perform very poorly, because it doesn’t start with a good model of self-modification, so the successor it replaces itself with would, with very high probability, not do anything useful. This is a problem for all agents that do not start off with detailed information about the environment, not just AIXI variants. The advantage that the example precommitment game player you provided has over AIXItl is not non-Cartesianism, but the fact that it was designed by someone who knows how the game mechanics work. It seems to me that the only way an agent that does not start off with strong assumptions about which program the environment is running could win is if self-modification is difficult enough that it could not accidentally self-modify into something useless before learning enough about its environment to protect itself.
My intuition is that the described AIXItl implementation fails because its implementation is too low-level. A higher-level AIXItl can succeed, though, so it’s not a limitation of AIXItl. Consider the following program (a rough code sketch follows the list):
P1) Send the current machine state* as input to a ‘virtual’ AIXItl.
P2) Read the output of this AIXItl step, which will be a new program.
P3) Write a backup of the current machine state*. This could be in a non-executing register, for example.
P4) Replace the machine’s state (but not the backup!) to match the program provided by AIXItl.
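To make this concrete, here is a minimal Python sketch of that outer loop. All of the helpers (read_machine_state, aixitl_step, write_backup, load_and_run) are hypothetical placeholders standing in for the machinery described above, not a real AIXItl implementation:

```python
def run_embedded_aixitl():
    """Outer loop wrapping a 'virtual' AIXItl; every helper here is a hypothetical placeholder."""
    while True:
        state = read_machine_state()                  # P1: the machine's code and sensor data
        new_program = aixitl_step(observation=state)  # P2: AIXItl's chosen action is a new program
        write_backup(state)                           # P3: copy the state into a non-executing register
        load_and_run(new_program)                     # P4: overwrite the machine state (but not the backup)
```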
Now, as AlexMennen notes, this machine will no longer be AIXItl, and in all probability it will ‘brick’ itself. However, we can rectify this. The AIXItl agent is ‘virtual’ (i.e. not directly hooked up to the machine’s IO), so we can interpret its output programs in a safe way (see the sketch after this list):
We can use a total language, such that all outputted programs eventually halt.
We can prevent the language having (direct) access to the backup.
We can append a “then reload the backup” instruction to all programs.
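As a hedged sketch of how those three restrictions might be enforced, the load_and_run step from the loop above could look something like this (is_well_formed_total, strip_backup_access, execute, and restore_backup are again hypothetical placeholders):

```python
NO_OP = []  # an empty program: do nothing

def load_and_run(new_program):
    """Interpret AIXItl's output program under the three safety restrictions above."""
    if not is_well_formed_total(new_program):        # only programs in the total language are accepted,
        new_program = NO_OP                          # so every program that runs is guaranteed to halt
    safe_program = strip_backup_access(new_program)  # the language gets no (direct) access to the backup
    execute(safe_program)                            # run the program to completion
    restore_backup()                                 # the appended "then reload the backup" instruction
```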
This is still AIXItl, not a “variant”. It’s just running on a rather complex virtual machine. From AIXItl’s Cartesian point of view:
A1) Take in an observation, which will be provided in the form of a robot’s configuration.
A2) Output an action, in the form of a new robot configuration which will be run to completion.
A3) GOTO 1.
From an embodied viewpoint, we can see that the robot that AIXItl thinks it is programming doesn’t exactly correspond to the robot which actually exists (in particular, it doesn’t know that the real robot is also running AIXItl!). Also, where AIXItl measures time in terms of IO cycles, we can see that an arbitrary amount of time may pass between steps A1 and A2 (where AIXItl is ‘thinking’) and between steps A2 and A3 (where the robot is executing the new program, and AIXItl only exists in the backup).
This setup doesn’t solve all Cartesian problems: for example, AIXItl doesn’t understand that it might die; it has no control over the backup (on which a particularly egregious Omega might place restrictions**); and the backup-and-restore scheme (just like anything else) might be interfered with by the environment. However, this article’s main thrust is that a machine running AIXItl is unable to rewrite its code, which is false.
* Note that this doesn’t need to be complete; in particular, we can ignore the current state of execution. Only the “code” and sensor data need to be included.
** With more effort, we could have AIXItl’s output programs contain the backup and restore procedures, e.g. validated by strong types. This would allow a choice of different backup strategies, depending on the environment (e.g. “Omega wants this register to be empty, so I’ll write my backup to this hard drive instead, and be sure to restore it afterwards”).