Related to Legg’s work: Ben Goertzel’s paper “Toward a Formal Characterization of Real-World General Intelligence”

Abstract:

Two new formal definitions of intelligence are presented, the “pragmatic general intelligence” and “efficient pragmatic general intelligence.” Largely inspired by Legg and Hutter’s formal definition of “universal intelligence,” the goal of these definitions is to capture a notion of general intelligence that more closely models that possessed by humans and practical AI systems, which combine an element of universality with a certain degree of specialization to particular environments and goals. Pragmatic general intelligence measures the capability of an agent to achieve goals in environments, relative to prior distributions over goal and environment space. Efficient pragmatic general intelligence measures this same capability, but normalized by the amount of computational resources utilized in the course of the goal-achievement. A methodology is described for estimating these theoretical quantities based on observations of a real biological or artificial system operating in a real environment. Finally, a measure of the “degree of generality” of an intelligent system is presented, allowing a rigorous distinction between “general AI” and “narrow AI.”
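In rough Python, the shape of these two definitions might look something like the sketch below: a finite set of environment/goal pairs with explicit prior weights, an expected goal-achievement score, and a resource-normalised variant. The function names, the 0-to-1 achievement scores and the per-term division by resources are illustrative assumptions, not Goertzel’s actual formalism.

```python
# A minimal sketch of the *shape* of these definitions, not Goertzel's formalism.
# All names, scores and weights below are illustrative assumptions.

def pragmatic_gi(achievement, prior):
    """Expected goal-achievement, weighted by a prior over (environment, goal) pairs.

    achievement[(env, goal)] -- how well the agent achieves `goal` in `env` (0..1)
    prior[(env, goal)]       -- prior weight of that pair (weights sum to 1)
    """
    return sum(prior[eg] * achievement[eg] for eg in prior)

def efficient_pragmatic_gi(achievement, resources, prior):
    """Same expectation, with each term normalised by the resources consumed."""
    return sum(prior[eg] * achievement[eg] / resources[eg] for eg in prior)

# Hypothetical agent evaluated on two environment/goal pairs:
prior       = {("maze", "reach exit"): 0.7, ("chess", "win"): 0.3}
achievement = {("maze", "reach exit"): 0.9, ("chess", "win"): 0.2}
resources   = {("maze", "reach exit"): 10.0, ("chess", "win"): 500.0}

print(pragmatic_gi(achievement, prior))                       # ~0.69
print(efficient_pragmatic_gi(achievement, resources, prior))  # ~0.063
```

Whether the normalisation divides each term or the overall expectation, and how resources are measured, are exactly the details the paper has to pin down.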
Yudkowsky’s attempt seems to be of little practical use—as I explain in the comments there and here. It’s a combination of a not-very-useful concept and misleading terminology.
Legg’s AIQ seems to be a much more reasonable approach. Mahoney has previously done something similar.
I discussed a third group’s attempt: http://lesswrong.com/lw/42t/aixistyle_iq_tests/
Thanks for the links, but I don’t understand your objections: when measuring the optimisation power of a system or an AI, costs such as cycles, memory use, etc. are already implicitly included in Eliezer’s measure. If the AI spends all its time calculating without ever achieving anything, or if it has too little memory to complete any calculations, it will achieve an optimisation of zero.
Can you make sense of Shane Legg’s objection, then?
One of my criticisms was this:
If you attempt to quantify the “power” of an optimisation process—without any attempt to factor in the number of evaluations required, the time taken, or the resources used—the “best” algorithm is usually an exhaustive search.

I don’t see the point of calling something “optimisation power”—and then using it to award a brain-dead algorithm full marks.
I think your objection shows that you failed to read (or appreciate) this bit:

You can quantify this, at least in theory, supposing you have (A) the agent or optimization process’s preference ordering, and (B) a measure of the space of outcomes—which, for discrete outcomes in a finite space of possibilities, could just consist of counting them—then you can quantify how small a target is being hit, within how large a greater region.
No “limited resources”, just “preference ordering”.
I would say that the simple algorithm he describes has immense optimisation power. If there were a competitive situation, and other competent agents were trying to derail its goal, then its optimisation power would drop close to zero. If your objection is that it’s wrong to define a single “optimisation power” floating platonically above the agent, then I agree.
His very next paragraph is:

Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank. Dividing this by the total size of the space gives you the relative smallness of the target—did you hit an outcome that was one in a million? One in a trillion?
“outcome achieved”. Hence the optimisation measure captures how effective the agent is at implementing its agenda. An agent that didn’t have the resources to think well or fast enough would score low, because it wouldn’t implement anything.
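For what it’s worth, here is a minimal sketch of that calculation for a finite outcome space. The function name is made up, and reporting the fraction in bits via -log2 (so “one in a million” comes out near 20 bits) is just a convenient way of stating it.

```python
from math import log2

def optimisation_power_bits(outcome_achieved, outcome_space, utility):
    """Count the outcomes ranked at least as highly as the one achieved,
    divide by the size of the whole space, and report -log2 of that fraction."""
    at_least_as_good = sum(
        1 for o in outcome_space if utility(o) >= utility(outcome_achieved)
    )
    return -log2(at_least_as_good / len(outcome_space))

# Toy example: a million outcomes, utility = the outcome's own value.
space = range(1_000_000)
print(optimisation_power_bits(999_999, space, lambda o: o))  # ~19.93 bits (1 in 1e6)
print(optimisation_power_bits(500_000, space, lambda o: o))  # ~1.0 bit (1 in 2)
```

On this reading, an agent that never achieves anything better than the default outcome hits the whole space (fraction 1) and scores 0 bits.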
The article talks about “preference ordering”. There’s no mention of how long the preference ordering takes to output. Resource constraints are missing from the whole article. It’s optimisation power without consideration of resource limitation. Exhaustive search wins that contest—with the highest possible score.
Even if you factor in resource constraints (for example, by imposing time and resource limits), this is still a per-problem metric, while the term “optimisation power” suggests some more general capability.
“outcome achieved”, “did you hit an outcome”, “optimization processes produce surprises”, “relative improbability of ‘equally good or better’ outcomes”—he’s talking about the outcome produced (and then using the preference orderings to measure optimisation power given that that outcome was produced).
The time taken is not explicitly modeled, but is indirectly: exhaustive search only wins if the agent really has all the time in the world to implement its plans. An AI due to get smashed in a year if it doesn’t produce anything will have an optimisation of zero if it uses exhaustive search.
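To make that concrete, here is a toy version of the same calculation applied to an exhaustive searcher with and without a deadline. The all-or-nothing deadline model (either the search finishes within its budget or the agent achieves only the default outcome) is a simplification assumed for the example, not anything from the post.

```python
from math import log2

def bits(outcome, space, utility):
    """-log2 of the fraction of outcomes at least as good as the one achieved."""
    return -log2(sum(utility(o) >= utility(outcome) for o in space) / len(space))

def exhaustive_search(space, utility, evaluation_budget=None):
    """Outcome the searcher actually achieves: the optimum if it can examine the
    whole space within its budget, otherwise only the default outcome."""
    if evaluation_budget is not None and evaluation_budget < len(space):
        return space[0]              # cut off before producing anything
    return max(space, key=utility)   # all the time in the world

space = range(1_000_000)
u = lambda o: o
print(bits(exhaustive_search(space, u), space, u))                           # ~19.93 bits
print(bits(exhaustive_search(space, u, evaluation_budget=1_000), space, u))  # 0.0 bits
```

The same scoring function gives full marks in the first case and zero in the second.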
“An optimisation of zero” ?!?
Are you objecting to the phrasing or to the point?
That was the terminology—though it isn’t just the terminology that is busted here.
Frankly, I find it hard to believe that you seem to be taking this idea seriously.
I haven’t decided whether the idea is good or bad yet—I haven’t yet evaluated it properly.
But as far as I can tell, your objection to it is incorrect. A naive search program would have very low optimisation power by Eliezer’s criteria—is there a flaw in my argument?
Essentially I agree that that particular objection is largely ineffectual. It is possible to build resource constraints into the environment if you like—though usually resource constraints are at least partly to do with the agent.
Resource constraints need to be specified somewhere. Otherwise exhaustive search (10 mins) gets one score and exhaustive search (10 years) gets another score—and the metric isn’t well defined.
If you see the optimisation score as being attached to a particular system (agent + code + hardware + power available), then there isn’t a problem. It’s only if you want to talk about the optimisation power of an algorithm in a platonic sense that the definition fails.
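A sketch of that view: attach the score to a system description rather than to the bare algorithm, so “exhaustive search, 10 minutes” and “exhaustive search, 10 years” are simply different systems with different, well-defined scores. The System fields and the toy pass/fail scoring rule are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class System:
    algorithm: str           # e.g. "exhaustive search"
    time_budget_s: float     # time / power available to the system
    evals_per_second: float  # hardware speed

def optimisation_score(system: System, space_size: int) -> float:
    """Toy rule: exhaustive search gets full marks only if it can finish in time."""
    evaluations_possible = system.time_budget_s * system.evals_per_second
    return 1.0 if evaluations_possible >= space_size else 0.0

ten_minutes = System("exhaustive search", 10 * 60, 1e6)
ten_years   = System("exhaustive search", 10 * 365 * 24 * 3600, 1e6)

print(optimisation_score(ten_minutes, space_size=10**12))  # 0.0
print(optimisation_score(ten_years,   space_size=10**12))  # 1.0
```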
Upvoted because admitting to error is rare and admirable, even on Less Wrong :-)