One issue, which many of these papers bring up, is that the way we define the distribution of environments in which to test the agent has a major impact on its measured intelligence. While a sufficiently intelligent agent would perform well in any environment or distribution of environments, that is probably not much help, since even humans don't seem to be at that level of intelligence: we evolved to deal with the kinds of environments that have historically existed on our planet, not with arbitrary ones. Legg & Veness (2011) discuss this issue:
When looking at converting the Universal Intelligence Measure into a concrete test of intelligence, a major issue is the choice of a suitable reference machine. Unfortunately, there is no such thing as a canonical universal Turing machine, and the choice that we make can have a significant impact on the test results. Very powerful agents such as AIXI will achieve high universal intelligence no matter what reference machine we choose, assuming we allow agents to train from samples prior to taking the test, as suggested in Legg and Hutter (2007). For more limited agents however, the choice of reference machine is important. Indeed, in the worst case it can cause serious problems (Hibbard, 2009). When used with typical modern reinforcement learning algorithms and a fairly natural reference machine, we expect the performance of the test to lie between these two extremes. That is, we expect that the reference machine will be important, but perhaps not so important that we will be unable to construct a useful test of machine intelligence. Providing some empirical insight into this is one of the main aims of this paper.
Before choosing a reference machine, it is worth considering, in broad terms, the effect that different reference machines will have on the intelligence measure. For example, if the reference machine is like the Lisp programming language, environments that can be compactly described using lists will be more probable. This would more heavily weight these environments in the measure, and thus if we were trying to increase the universal intelligence of an agent with respect to this particular reference machine, we would progress most rapidly if we focused our effort on our agent’s ability to deal with this class of environments. On the other hand, with a more Prolog like reference machine, environments with a logical rule structure would be more important. More generally, with a simple reference machine, learning to deal with small mathematical, abstract and logical problems would be emphasised as these environments would be the ones computed by small programs. These tests would be more like the sequence prediction and logical puzzle problems that appear in some IQ tests.
What about very complex reference machines? This would permit all kinds of strange machines, potentially causing the most likely environments to have bizarre structures. As we would like our agents to be effective in dealing with problems in the real world, if we do use a complex reference machine, it seems the best choice would be to use a machine that closely resembles the structure of the real world. Thus, the Universal Intelligence Measure would become a simulated version of reality, where the probability of encountering any given challenge would reflect its real world likelihood. Between these extremes, a moderately complex reference machine might include three dimensional space and elementary physics.
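For concreteness, the measure the quote refers to is Legg & Hutter's (2007) universal intelligence of an agent \pi, in which the reference machine enters through the complexity term:

\[
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
\]

where E is the class of computable environments, K(\mu) is the Kolmogorov complexity of \mu relative to the chosen reference machine, and V^{\pi}_{\mu} is the expected total reward of \pi in \mu. A Lisp-like or Prolog-like reference machine changes K(\mu), and hence the weights 2^{-K(\mu)}, which is exactly the dependence the quote describes.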
Which is why a lot of the above papers focus on trying to define useful environmental distributions.
Thx for the references! I was familiar with Hernandez-Orallo et al. (2011) and Goertzel (2010).
However, it seems that none of these papers tackles the Duality problem.
Regarding environmental distributions, I think this problem is solved rather elegantly in my approach by the quasi-Solomonoff distribution, which singles out environments compatible with the tentative model D. Essentially it is the Solomonoff prior updated by a period of observations during which D-behavior was seen.
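Schematically (a sketch of this idea in my notation, not necessarily the paper's exact definition), for an environment \mu and a history h_D exhibiting D-behavior:

\[
\xi_D(\mu) \propto 2^{-K(\mu)} \, \mu(h_D)
\]

so environments inconsistent with the observed D-behavior receive zero weight (\mu(h_D) = 0), while the usual complexity penalty 2^{-K(\mu)} still orders the compatible ones.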
Regarding the choice of a reference machine, its role asymptotically vanishes in the tail of the Solomonoff distribution. The quasi-Solomonoff distribution samples the tail of the Solomonoff distribution by design, and the more so the more complex D is.
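(The relevant fact here is the standard invariance theorem: for any two universal reference machines U and V there is a constant c_{U,V}, independent of the object x, such that

\[
|K_U(x) - K_V(x)| \le c_{U,V}
\]

so the weights 2^{-K_U(\mu)} and 2^{-K_V(\mu)} agree up to the fixed multiplicative factor 2^{c_{U,V}}, which matters less and less as K(\mu) grows, i.e. the deeper into the tail one samples.)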
In applications it seems to be a good idea to make D as complex as possible (i.e. to put in as much information about the universe as possible) while using a reference machine that is as simple as possible. In fact I would use the lambda calculus rather than a Turing machine, because the simpler the reference machine, the closer the relation between the Solomonoff distribution and Occam's razor. If we assume that our intuitive grasp of simplicity is approximately correct, then using a complex reference machine doesn't make sense.
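As a toy illustration of the reference-machine dependence being discussed (my own sketch; the two machines below are hypothetical stand-ins, and a real treatment would use a prefix-free universal machine rather than these trivial ones):

```python
# Toy sketch: approximate a Solomonoff-style prior
#   P(x) = sum of 2^(-len(p)) over programs p with M(p) = x
# for two toy "reference machines" M, to show how the machine
# choice shifts which outputs count as simple.
from collections import defaultdict
from itertools import product

def machine_a(program: str) -> str:
    # Machine A treats the program as a unary repeat count for "01".
    return "01" * len(program)

def machine_b(program: str) -> str:
    # Machine B just echoes its program verbatim.
    return program

def prior(machine, max_len: int = 10) -> dict:
    # Enumerate all bitstrings up to max_len as programs; a program
    # of length n contributes weight 2^(-n) to the output it produces.
    weights = defaultdict(float)
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            weights[machine("".join(bits))] += 2.0 ** (-n)
    return weights

pa, pb = prior(machine_a), prior(machine_b)
x = "01" * 4
print(pa[x])  # 1.0     -- all 16 length-4 programs produce it under A
print(pb[x])  # ~0.004  -- only the single length-8 program does under B
```

Under machine A the repetitive string dominates the prior; under machine B it is just one of 256 equally weighted length-8 strings. An agent tuned to one machine's notion of simplicity can look unintelligent under the other's, which is the kind of worst case the quote attributes to Hibbard (2009).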
Some other discussions on formal measures of intelligence, also building on Legg & Hutter’s work:
Hernandez-Orallo & Dowe (2010) Measuring universal intelligence: Towards an anytime intelligence test
Hernandez-Orallo et al. (2011) On more realistic environment distributions for defining, evaluating and developing intelligence
Hibbard (2009) Bias and no free lunch in formal measures of intelligence
Hibbard (2011) Measuring agent intelligence via hierarchies of environments
Goertzel (2010) Toward a Formal Characterization of Real-World General Intelligence
Legg & Veness (2011) An Approximation of the Universal Intelligence Measure
Schaul et al. (2011) Measuring intelligence through games