Thanks for the comments! I’ll try to answer briefly.
It’s useful to think of the task as one of defining the intended intelligence metric. The Legg-Hutter metric can’t represent a heating-up game, because there isn’t a “heat” channel from the agent to the environment. Is it possible that an agent with high Legg-Hutter intelligence might be able to succeed on the heating-up game, given a heat channel? Yes, this is possible. But AIXItl would almost certainly not be able to do this (it cannot consider limiting computation for a few timesteps), and you shouldn’t expect agents with high LH score to do this, because this isn’t what LH measures. Embedding an agent with high LH score in a heating-up game violates an assumption under which the agent was shown to behave well. The problem here is not this one game in particular; the problem is that we still don’t know how to define the actual (naturalized) intelligence metric we care about. If we could formalize a set of universes and an embedding rule that allows us to measure agents on problems where their physical embodiment matters, that would constitute progress.
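For reference, the Legg-Hutter measure is (roughly) a complexity-weighted sum of expected reward over a class of computable environments:

$$\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu},$$

where $E$ is the class of computable reward-summable environments, $K(\mu)$ is the Kolmogorov complexity of $\mu$, and $V^{\pi}_{\mu}$ is the expected total reward policy $\pi$ obtains while interacting with $\mu$. The only channels in this formalism are actions, observations, and rewards; nothing gives $\mu$ access to the agent’s heat output, computation time, or any other physical side effect, which is why the heating-up game can’t even be posed inside it.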
You’re correct that the agent ultimately needs to choose based purely on information from its observations (well, that and the priors), but there’s a difference between agents that are attempting to optimize what they see, and agents that are attempting to optimize what actually happened. Yes, the latter is ultimately an observation-based decision process, but it’s a fairly complicated one (note the difficulty of cashing out the word “actually” and the need to worry about the agent’s beliefs and their accuracy). The problem of ontology identification is not one of avoiding the fact that the agent must decide based on observations “alone”; it’s one of figuring out how to build agents that do the specific type of observation-based decision that we prefer (e.g. optimizing for reality rather than sense data). The real question, after all, is “we want agents that optimize actual reality, how do we build them?”—this requires cashing out some parts of the question that are glossed over if you define environments as “a thing that spits out an observation and a reward in each timestep.”
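As a toy illustration of that difference (everything here is hypothetical, a sketch of where the utility gets evaluated rather than anyone’s proposed formalism): an agent scored on what it sees needs only its reward channel, while an agent scored on what actually happened already needs a prior over worlds, a likelihood model, and a utility function over worlds.

```python
# Toy sketch only: hypothetical names, worlds as opaque labels,
# beliefs as dicts mapping world -> probability.

def score_by_percepts(rewards):
    """Optimizing what the agent sees: a function of the reward channel alone."""
    return sum(rewards)

def score_by_inferred_reality(observations, prior, likelihood, utility):
    """Optimizing what actually happened: form a posterior over world-histories
    from the percepts, then take expected utility of the worlds themselves
    (not of the percepts)."""
    unnormalized = {w: p * likelihood(observations, w) for w, p in prior.items()}
    total = sum(unnormalized.values()) or 1.0
    return sum((p / total) * utility(w) for w, p in unnormalized.items())
```

Even the toy version has to smuggle in a world model and a utility over worlds; that extra machinery is exactly what makes the second kind of agent hard to specify.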
With regard to your suggestion of a metric which allows the value function to vary: this is all well and good, but now how do I find the V that actually corresponds to my goals? Say I want the V which scores the agent well for maximizing diamond in reality. This requires specifying a function which (1) takes observations; (2) uses them along with priors and knowledge about how the agent behaves to compute the expected state of outside reality; and (3) computes how much diamond is actually in reality and scores accordingly. But that’s not a value function; that’s most of an AGI!
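To make that concrete, here is a schematic of what such a V would have to look like (every name below is hypothetical, and each stub hides an open problem rather than a known subroutine):

```python
# Schematic only: the stubs stand in for unsolved problems, not known algorithms.

def diamond_value_function(observations, prior, agent_model):
    # (1) take the agent's observation history as input
    # (2) combine it with priors and a model of how the agent behaves
    #     to form beliefs about the state of outside reality
    world_beliefs = infer_world_state(observations, prior, agent_model)
    # (3) evaluate how much diamond is actually out there and score accordingly
    #     (this is where ontology identification bites: "diamond" has to be
    #     located inside whatever ontology the inferred world model uses)
    return expected_diamond(world_beliefs)

def infer_world_state(observations, prior, agent_model):
    raise NotImplementedError("building a world model from percepts")

def expected_diamond(world_beliefs):
    raise NotImplementedError("ontology identification")
```

Filling in those two stubs is most of the work, which is the sense in which this V is “most of an AGI” rather than a simple scoring rule.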
It’s fine to say that the utility function must ultimately be defined over percepts, but in order to give me the function over percepts that I actually want (e.g. one that figures out how reality looks and scores an agent for maximizing it appropriately), I need a value function which turns percepts into a world model, figures out what the agent is going to do, and solves the ontology identification problem in order to rate the resulting world history. A huge part of the problem of intelligence is figuring out how to define a function of percepts which optimizes goals in actual reality—so while you’re welcome to think of ontology identification as part of “picking the right value function”, you eventually have to unpack that process, and the ontology identification problem is one of the hurdles that arise when you try to do so.
I hope that helps!