I believe I have a workable solution to the duality problem. It is essentially a special case of the Orseau-Ring framework, viewed slightly differently. Consider a specific computer architecture M, equipped with an input channel for receiving inputs ostensibly from the environment (although the environment doesn’t appear explicitly in the formalism) and possibly with special instructions for self-reprogramming (although the latter are semi-redundant, as will become clear below). This architecture has a state space Sigma (typically M is a universal Turing machine, so Sigma is countable, but it is also possible to consider models with finite RAM, in which case M is a finite state automaton). Some state transitions s1 → s2 are “legitimate” while others are not (note that s1 doesn’t determine s2 uniquely, since the input from the environment can be arbitrary). Consider also a utility function U defined on arbitrary (possibly “illegitimate”) infinite histories of M, i.e. functions N → Sigma. Then an “agent” is simply an initial state of M: an s in Sigma regarded as a “program”. The intelligence of s is defined to be its expected utility, assuming the dynamics of M are described by a certain stochastic process.

If we stop here, without specifying this stochastic process, we get more or less an equivalent formulation of the Orseau-Ring framework. By analogy with Legg-Hutter, it is natural to assume this stochastic process is governed by the Solomonoff semi-measure. But if we do this, we probably won’t get any meaningful near-optimal agents, since we would need to write a program without knowing how the computer works. My suggestion is to deform the Solomonoff semi-measure by assigning weight 0 < p < 1 to state transitions s1 → s2 which are illegitimate in terms of M. Since p < 1, the near-optimal agents can rely to some extent on our computer architecture M, which should allow them to be sophisticated.

On the other hand, p > 0, so these agents have to take possible wire-heading into account. In particular, they can make positive use of wire-heading to reprogram themselves even if the basic architecture M doesn’t allow it, assuming of course they are placed in a universe in which this is possible.
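To make the role of p concrete, here is a minimal toy sketch in Python. It is not the actual construction: the Solomonoff semi-measure is uncomputable, so the sketch assumes a uniform base weight over finite histories as a stand-in, multiplies that weight by p for each illegitimate transition, and normalizes. The function name `deformed_expected_utility` and the helpers `stays_put` and `count_ones` are hypothetical and chosen only for illustration.

```python
from itertools import product


def deformed_expected_utility(initial_state, states, is_legal, utility, p, horizon):
    """Expected utility of `initial_state` under a toy deformed measure.

    Assumption: each continuation of length `horizon` has base weight 1
    (a uniform stand-in for the uncomputable Solomonoff semi-measure).
    The weight is multiplied by `p` once per illegitimate transition,
    and the weights are normalized at the end.
    """
    total_weight = 0.0
    weighted_utility = 0.0
    for tail in product(states, repeat=horizon):
        history = (initial_state,) + tail
        # Count the transitions that M's rules do not allow.
        illegal = sum(
            1 for s1, s2 in zip(history, history[1:]) if not is_legal(s1, s2)
        )
        weight = p ** illegal
        total_weight += weight
        weighted_utility += weight * utility(history)
    return weighted_utility / total_weight


# Hypothetical toy architecture: two states, and M only allows staying put.
def stays_put(s1, s2):
    return s1 == s2


# Hypothetical utility: number of time steps spent in state 1.
def count_ones(history):
    return sum(history)


for p in (0.0, 0.5, 1.0):
    value = deformed_expected_utility(0, (0, 1), stays_put, count_ones, p, horizon=1)
    print(f"p={p}: expected utility of initial state 0 is {value}")
```

In this toy, an agent starting in state 0 has zero expected utility at p = 0, because M's rules never let it leave that state. For p > 0 the illegitimate jump to state 1 carries some weight, so the "impossible" transition starts to matter for the evaluation.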
I think you are proposing to have some hypotheses privileged at the beginning of Solomonoff induction, but not too much, because the uncertainty helps fight wireheading by providing knowledge about the existence of an idealized, “true” utility function and world model. Is that a correct summary? (Just trying to test whether I understand what you mean.)
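“In particular they can make positive use of wire-heading to reprogram themselves even if the basic architecture M doesn’t allow it”
Can you explain this more?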
Yes, I think you got it more or less right. For p = 0 we would just get a version of Legg-Hutter (AIXI) with limited computing resources (but with the duality problem preserved). For p > 0, no hypothesis is completely ruled out, and the agent should be able to find the correct hypothesis given sufficient evidence. In particular, she should be able to correct her assumptions regarding how her own mind works. Of course, this requires the correct hypothesis to be sufficiently aligned with M’s architecture for the agent to work at all. The utility function is actually built in from the start. However, if we like, we can choose it to be something like a sum of external input bits with decaying weights (to ensure convergence), which would be in the spirit of the Legg-Hutter “reinforcement learning” approach.
In particular, the agent can discover that the true “physics” allows for reprogramming the agent, even though the initially assumed architecture M didn’t allow it. In this case she can use it to reprogram herself for her own benefit. To draw a parallel, a human can perform brain surgery on herself thanks to her acquired knowledge about the physics of the universe and of her brain, and in principle she can use it to change the functioning of her brain in ways that are incompatible with her “intuitive” initial assumptions about her own mind.
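The utility choice mentioned above (a sum of external input bits with decaying weights) can be sketched as follows. The geometric weights decay**(t+1) are an assumption made here for illustration, since only "decaying weights" is specified. The name `decaying_bit_utility` is hypothetical.

```python
def decaying_bit_utility(input_bits, decay=0.5):
    """Sum of input bits, where the bit at step t gets weight decay**(t + 1).

    Assumption: geometric weights. For 0 < decay < 1 the infinite sum is
    bounded by decay / (1 - decay), so the utility converges.
    """
    return sum(bit * decay ** (t + 1) for t, bit in enumerate(input_bits))


# A finite prefix of the input stream, used as a truncated history.
prefix = [1, 0, 1, 1]
print(decaying_bit_utility(prefix))
```

With geometric weights, early input bits dominate and the tail contributes arbitrarily little. That is what makes the utility well defined on infinite histories.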
I made some improvements to the formalism, see http://lesswrong.com/lw/cze/reply_to_holden_on_tool_ai/8fjb
There I consider a stochastic model M, whereas here M is a non-deterministic model, but the same principle applies. Namely, we consider a Solomonoff process starting a time t0 before the formation of agent A, conditioned on M’s rules being observed during the time before A’s formation and on A’s existence at the time of its formation. The expected utility is computed with respect to the resulting distribution.
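One possible way to write this conditioning down, as a sketch only. The notation is introduced here and is not taken from the linked comment: \xi stands for the Solomonoff-type measure over histories, h for a history indexed from time -t_0, h_{\ge 0} for the part of the history from A's formation onward, and I(A) for the resulting intelligence of A. Reading "A's existence at the time of its formation" as h_0 = A is also an assumption.

```latex
I(A) \;=\; \mathbb{E}_{h \sim \xi}\!\left[\, U\!\left(h_{\ge 0}\right) \;\middle|\;
  h_t \to h_{t+1} \text{ is legitimate in } M \text{ for all } -t_0 \le t < 0,
  \;\; h_0 = A \,\right]
```

Under this reading, M's rules are only enforced before the agent is formed. After formation, illegitimate transitions are allowed again, with whatever probability the conditioned process assigns them.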