If we forget the specific references to AIXI, and simply stipulate that “A” is an expected-utility maximizer with good inferential capabilities and an arbitrary utility function, what happens? If the choice is between Model 1 and Model 2, then Model 1 should be simpler because it doesn’t contain an A-shaped hole in the universe, inside which different laws apply. In fact, I don’t even understand why Model 2 makes sense—it ignores the causal role that the output wire’s behavior plays for the “rest of the universe”. According to Model 2, there is a steady stream of uncaused events occurring at the output wire, which then feed into the cause-and-effect of the rest of the universe, and, at the input wire, a steady stream of events which (unlike all other events in the universe) have no effect on anything else.
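To make the structural complaint concrete, here is a minimal sketch (my own, not part of the original discussion) of the two hypotheses rendered as causal graphs. The node names are hypothetical, and I am assuming Model 1 embeds the agent as an ordinary node while Model 2 treats the wires as causal boundary conditions; the point is just that Model 2 ends up with a node that has no causes and a node that has no effects.

```python
# Toy sketch of the two hypotheses as causal graphs (node names are
# illustrative assumptions, not terms from the original discussion).
# Each graph maps a node to the set of its direct causes (parents).

model_1 = {
    "environment": {"agent"},       # the agent's outputs influence the world
    "agent":       {"environment"}, # and the world influences the agent
}

model_2 = {
    "environment": {"output_wire"},  # output events feed into the world...
    "output_wire": set(),            # ...but have no causes of their own
    "input_wire":  {"environment"},  # input events are caused by the world,
    # and nothing lists input_wire as a parent: its events affect nothing.
}

def uncaused(graph):
    """Nodes with no parents: streams of events the model leaves unexplained."""
    return {node for node, parents in graph.items() if not parents}

def inert(graph):
    """Nodes that are no node's parent: events with no effect on anything."""
    causes = set().union(*graph.values())
    return set(graph) - causes

print(uncaused(model_2), inert(model_2))  # {'output_wire'} {'input_wire'}
print(uncaused(model_1), inert(model_1))  # set() set()
```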
I see no reason why the AI can’t infer the existence of something like itself, playing a causal role in its region of the universe. It cannot contain a complete simulation of itself, but then it can’t contain a complete simulation of the whole universe outside its box, either. Its cognitive resources are necessarily bounded, both in its ability to test all possible causal models against the data and in its ability to store data, or a detailed model of the universe. It will necessarily be operating heuristically, but it should be capable of coming up with the idea of a big universe with universal laws, and it should be capable of extracting valuable guides to action from that concept, even though it doesn’t and can’t build a completely detailed model of the whole universe.
So how might its evolving model of the universe look? A bit like the paradigm of Hashlife applied to real physics. Hashlife is an algorithm for simulating Conway’s Game of Life that builds lookup tables for recurring configurations: instead of working your way through the computations again, you just use the lookup table (and possibly skip over many Game-of-Life steps, straight to a final configuration). In the real world, we have, let’s say, the Standard Model coupled to gravity, and above that quantum chemistry, and above that a description of the world as continuum objects with stress and strain properties. If you are an agent acting in that world, you may characterize the states of the objects around you at varying degrees of resolution, and apply principles from different “levels” of physics in order to reason about their future behavior, but the high-level principles are supposed to be deductively consistent with the bottom level being the actual reality everywhere.
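For readers who haven’t met Hashlife: the following is a toy illustration of the memoization idea only, not real Hashlife (which memoizes the evolution of large quadtree blocks and can skip many generations at once). Here the same trick is applied at the smallest possible scale: once a local configuration has been computed, its outcome is reused from a lookup table instead of being recomputed.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def next_cell(neighborhood):
    """Next state of the centre cell of a 3x3 neighborhood (tuple of 9 bools)."""
    centre = neighborhood[4]
    live_neighbours = sum(neighborhood) - centre
    return live_neighbours == 3 or (centre and live_neighbours == 2)

def step(grid):
    """One Game-of-Life step on a set of live (x, y) cells."""
    candidates = {(x + dx, y + dy) for x, y in grid
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
    return {(x, y) for x, y in candidates
            if next_cell(tuple((x + dx, y + dy) in grid
                               for dy in (-1, 0, 1) for dx in (-1, 0, 1)))}

# A glider: after a few steps, most local configurations it produces are
# already in the cache, so they are looked up rather than recomputed.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    glider = step(glider)
print(sorted(glider), next_cell.cache_info().hits)
```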
If our AI is just a state machine, internally evolving a finite state machine that will guide its actions, then it can still develop a similar multilayered “Hashphysics” model of the world. It may not have, declaratively stored in it anywhere, the proposition “the world is made of atoms”, but it can certainly have a multilevel model of the world-state which is updated on the basis of rules functionally equivalent to such a belief. The state-of-the-world model at any time might look like: stuff arranged in Euclidean space, extending to infinity in all directions; an assortment of macroscopic objects in the immediate vicinity, with specified approximate shapes, textures, and other properties; a belief that certain objects are made of specific substances, implying that they are made of a population of specific molecular entities presenting a particular thermodynamic profile; and finally, should the AI find itself interacting directly with a specific microscopic entity, that will be modeled in terms consistent with fundamental physics. The rules for updating the concrete world-model at different levels would encompass not just positing the existence of greater detail when that becomes necessary, but also forgetting low-level physical details once they become irrelevant.
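A minimal sketch of what such a multilevel world-model might look like as a data structure, assuming the simplest possible version: each object is tracked at the coarsest level that current interactions require, finer detail is posited on demand, and that detail is forgotten again once it is no longer relevant. The level names and the refine/coarsen rules are my own illustrative assumptions, not a proposal from the original text.

```python
from dataclasses import dataclass, field

LEVELS = ["continuum", "molecular", "fundamental"]  # coarse -> fine

@dataclass
class WorldObject:
    name: str
    level: str = "continuum"
    detail: dict = field(default_factory=dict)  # level-specific state

    def refine(self, target_level, **detail):
        """Posit finer-grained structure when an interaction demands it."""
        if LEVELS.index(target_level) > LEVELS.index(self.level):
            self.level = target_level
            self.detail.update(detail)

    def coarsen(self, target_level):
        """Forget low-level detail once it has become irrelevant."""
        if LEVELS.index(target_level) < LEVELS.index(self.level):
            self.level = target_level
            self.detail.clear()

# Example: a block of iron is tracked as a continuum object until the agent
# probes its thermodynamics, then coarsened again afterwards.
block = WorldObject("iron block")
block.refine("molecular", substance="Fe", temperature_K=300)
print(block.level, block.detail)  # molecular {'substance': 'Fe', 'temperature_K': 300}
block.coarsen("continuum")
print(block.level, block.detail)  # continuum {}
```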
As it interacts with the world and infers more and more about the specific objects existing in the world, I see every reason for it to eventually posit its own existence—of course, it won’t know or think “that is me”; but there will be an object in the world-model which is the representation of itself. (Eventually, there might even be an object in the world-model representing its own self-representation...)
If the choice is between Model 1 and Model 2, then Model 1 should be simpler because it doesn’t contain an A-shaped hole in the universe, inside which different laws apply.
Model 2 doesn’t have this hole; it includes a description of the agent as part of the world. However, belief in it doesn’t have any effect on A’s decisions, so this point is of unclear relevance.