Regarding the question of formalizing an optimization agent whose goals are defined in terms of the external universe rather than its sensory input: it is possible to attack the problem by generalizing the framework I described in http://lesswrong.com/lw/gex/save_the_princess_a_tale_of_aixi_and_utility/8ekk for solving the duality problem.

Specifically, consider an “initial guess” stochastic model of the universe, including the machine on which our agent is running. I call it the “innate model” M. Now consider a stochastic process with the same degrees of freedom as M but governed by the Solomonoff semi-measure. This is the “unbiased model” S. The two can be combined by assigning transition probabilities proportional to the product of the probabilities assigned by M and S. If M is sufficiently “insecure” (in particular, it doesn’t assign probability 0 to any transition), then the resulting model S’, considered as a prior, allows arriving at any computable model after sufficient learning.

Fix a utility function on the space of histories of our model (note that the histories include both intrinsic and extrinsic degrees of freedom). The intelligence I(A) of any given agent A (= program written into M in the initial state) can now be defined as the expected utility of A in S’. We can then consider optimal or near-optimal agents in this sense (unlike in the Legg-Hutter formalism for measuring intelligence, there is no guarantee that a maximum rather than a supremum exists, unless of course we limit the length of the programs we consider).

This is a generalization of the Legg-Hutter formalism which accounts for limited computational resources, solves the duality problem (such agents take the possibility of wireheading into account) and also provides a solution to the ontology problem. It is essentially a special case of the Orseau-Ring framework, but it is much more specific than Orseau-Ring, where the prior is left completely unspecified. You can think of it as a recipe for constructing Orseau-Ring priors from realistic problems.
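To make the product prescription concrete, here is a toy Python sketch in which both M and S are hand-written stochastic matrices over a small finite state space (the real S, the Solomonoff semi-measure, is of course uncomputable). The state count, the utility function, and the horizon are all invented for illustration; nothing here is part of the original proposal beyond the product-and-renormalize step and the definition of I(A) as expected utility in S’.

```python
import numpy as np

# Toy sketch of combining the innate model M with the unbiased model S by
# multiplying transition probabilities and renormalizing.  Both models are
# hand-written stochastic matrices here; in the actual proposal S is the
# (uncomputable) Solomonoff semi-measure.

STATES = 4  # hypothetical number of joint (intrinsic + extrinsic) states
rng = np.random.default_rng(0)

def random_stochastic_matrix(n):
    """Rows are probability distributions over the next state."""
    m = rng.random((n, n))
    return m / m.sum(axis=1, keepdims=True)

M = random_stochastic_matrix(STATES)  # "innate" guess at the dynamics
S = random_stochastic_matrix(STATES)  # stand-in for the Solomonoff semi-measure

# Combined model S': transition probabilities proportional to the product.
# Since M assigns no transition probability 0, no transition is ruled out.
S_prime = M * S
S_prime /= S_prime.sum(axis=1, keepdims=True)

def expected_utility(transition, utility, start, horizon, samples=5_000):
    """Monte Carlo estimate of E[utility(history)] under the given dynamics."""
    total = 0.0
    for _ in range(samples):
        state, history = start, [start]
        for _ in range(horizon):
            state = rng.choice(STATES, p=transition[state])
            history.append(state)
        total += utility(history)
    return total / samples

# I(A) would be the expected utility evaluated in S' with A written into the
# initial state; here the "agent" is just the choice of start state, and the
# utility (count visits to state 3) is arbitrary.
I_of_A = expected_utility(S_prime, utility=lambda h: h.count(3), start=0, horizon=20)
```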
I realized that although the idea of a deformed Solomonoff semi-measure is correct, the multiplication prescription I suggested is rather ad hoc. The following construction is a much more natural and justifiable way of combining M and S.
Fix a time parameter t0. Consider a stochastic process S(-t0) that begins at time t = -t0, where t = 0 is the time our agent A “forms”, governed by the Solomonoff semi-measure. Consider another stochastic process M(-t0) that begins from the initial conditions generated by S(-t0) (I’m assuming M only carries information about the dynamics, not about the initial conditions). Define S’ to be the conditional probability distribution obtained from S(-t0) by imposing two conditions (a toy sketch of this conditioning follows the list below):
a. S and M coincide on the time interval [-t0, 0]
b. The universe contains A at time t=0
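Here is a rejection-sampling sketch of this conditioning, again with toy stand-ins: a small stochastic matrix instead of the Solomonoff-driven S(-t0), a deterministic toy dynamics for M, and an arbitrary predicate playing the role of “the universe contains A at t = 0”. All names and numbers are illustrative assumptions, not part of the construction itself.

```python
import numpy as np

# Rejection-sampling sketch of the conditional distribution S'.
STATES = 6
T0 = 5  # how long we have "observed behaviour M" before the agent forms

rng = np.random.default_rng(1)
S_matrix = rng.random((STATES, STATES))
S_matrix /= S_matrix.sum(axis=1, keepdims=True)   # stand-in for S(-t0)

def M_step(state):
    """Toy innate dynamics (deterministic for simplicity)."""
    return (state + 1) % STATES

def contains_agent(state):
    """Toy stand-in for 'the universe contains A at t = 0'."""
    return state == 0

def sample_S_prime(horizon, tries=100_000):
    """Sample post-t=0 histories from S conditioned on (a) and (b)."""
    accepted = []
    for _ in range(tries):
        state = rng.integers(STATES)              # initial condition generated by S(-t0)
        ok = True
        for _ in range(T0):                       # the interval [-t0, 0]
            nxt = rng.choice(STATES, p=S_matrix[state])
            if nxt != M_step(state):              # condition (a): S and M coincide
                ok = False
                break
            state = nxt
        if not ok or not contains_agent(state):   # condition (b): A exists at t = 0
            continue
        history = [state]
        for _ in range(horizon):                  # unconditioned continuation after t = 0
            state = rng.choice(STATES, p=S_matrix[state])
            history.append(state)
        accepted.append(history)
    return accepted

histories = sample_S_prime(horizon=10)
```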
Thus t0 reflects the extent to which we are certain about M: it’s as if we are telling the agent that we have observed the universe behaving according to M for a time period of length t0.
There is an interesting side effect to this framework, namely that A can exert “acausal” influence on the utility by affecting the initial conditions of the universe (i.e. it selects universes in which A is likely to exist). This might seem like an artifact of the model but I think it might be a legitimate effect: if we believe in one-boxing in Newcomb’s paradox, why shouldn’t we accept such acausal effects?
For models with a concept of space and a finite information velocity, like cellular automata, it might make sense to limit the domain of “observed M” in space as well as in time, to A’s past “light-cone”.
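As a minimal illustration of what such a restriction could look like, here is a small helper for a hypothetical 1D cellular automaton with neighbourhood radius r (so information travels at most r cells per step); the 1D setting and the function name are my own, purely for illustration.

```python
def past_light_cone(agent_cells, t0, radius=1):
    """Return the set {(t, x)} for -t0 <= t <= 0 that can influence agent_cells at t = 0."""
    cone = set()
    for t in range(-t0, 1):
        reach = -t * radius                  # how far influence can travel in |t| steps
        for x in agent_cells:
            for dx in range(-reach, reach + 1):
                cone.add((t, x + dx))
    return cone

# Example: an agent occupying cells {10, 11}, with t0 = 3 and a radius-1 rule.
cone = past_light_cone({10, 11}, t0=3)
```

Only the cells in this set would enter the condition that S and M coincide; everything outside it is left to the unbiased process.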
I cannot even slightly visualize what you mean by this. Please explain how it would be used to construct an AI that made glider-oids in a Life-like cellular automaton universe.
Is the AI hardware separate from the cellular automaton, or is it a part of it? Assuming the latter, we need to decide which degrees of freedom of the cellular automaton form the program of our AI. For example, we can select a finite set of cells and allow setting their values arbitrarily. Then we need to specify our utility function. For example, it can be a weighted sum of the number of gliders at different moments of time, or a maximum, or whatever; however, we need to make sure the expectation values converge. The “AI” is then simply the assignment of values to the selected cells in the initial state which yields the maximal expected utility.

Note though that if we’re sure about the law governing the cellular automaton, there’s no reason to use the Solomonoff semi-measure at all (except maybe as a prior for the initial state outside the selected cells). However, if our idea of the way the cellular automaton works is only an “initial guess”, then the expectation value is evaluated w.r.t. a stochastic process governed by a “deformed Solomonoff” semi-measure in which transitions that are illegal w.r.t. the assumed cellular-automaton law are suppressed by some factor 0 < p < 1 relative to “pure” Solomonoff inference.

Note that, contrary to the case of AIXI, I can only describe the measure of intelligence; I cannot constructively describe the agent maximizing this measure. This is unsurprising, since building a real (bounded-computing-resources) AI is a very difficult problem.
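A brute-force toy sketch of this intelligence measure follows, with crude surrogates throughout: the “deformed Solomonoff” process is replaced by the Life rule plus independent per-cell deviations occurring with small probability eps (standing in for the suppression factor p), and counting live cells stands in for counting gliders, which would require a real pattern detector. The grid size, the set of programmable cells, and the utility weights are all invented for the example.

```python
import itertools
import numpy as np

# Brute-force sketch of I(program) for the Life example.
SIZE = 16
EPS = 0.01                       # crude surrogate for the suppression factor p
PROGRAM_CELLS = [(2, 2), (2, 3), (3, 2), (3, 3), (4, 4)]   # degrees of freedom we control
UTILITY_TIMES = {5: 1.0, 10: 2.0}                           # weighted sum over moments of time
rng = np.random.default_rng(2)

def life_step(grid):
    """One step of Conway's Life via neighbour counts."""
    n = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0))
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.int8)

def noisy_step(grid):
    """Life rule with probability-EPS deviations (the 'deformed' dynamics)."""
    nxt = life_step(grid)
    flips = rng.random(grid.shape) < EPS
    return np.where(flips, 1 - nxt, nxt)

def utility(history):
    # Live-cell count as a placeholder for a glider count.
    return sum(w * history[t].sum() for t, w in UTILITY_TIMES.items())

def intelligence(program, samples=200):
    """I(program): expected utility of the initial assignment to PROGRAM_CELLS."""
    total = 0.0
    for _ in range(samples):
        grid = np.zeros((SIZE, SIZE), dtype=np.int8)
        for cell, bit in zip(PROGRAM_CELLS, program):
            grid[cell] = bit
        history = [grid]
        for _ in range(max(UTILITY_TIMES)):
            grid = noisy_step(grid)
            history.append(grid)
        total += utility(history)
    return total / samples

# The optimum (non-constructive in general) found here by brute force over
# all assignments to the selected cells.
best = max(itertools.product((0, 1), repeat=len(PROGRAM_CELLS)), key=intelligence)
```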