Regarding the definition of “intelligence”: It’s not hard to propose definitions, if you assume the framework of computer science. Consider the cognitive architecture known as an “expected-utility maximizer” (EUM). To a first approximation, it has two parts. One part is the utility function, which ranks the desirability of the situations the entity finds itself in. The other is the problem-solver: it proposes actions selected so as to maximize expected utility.
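To make the two parts concrete, here is a minimal Python sketch; the `world_model.predict` interface and every name in it are assumptions of mine, chosen purely for illustration, not anyone’s actual proposal.

```python
# A minimal sketch of the two-part EUM architecture described above.
# The world_model interface, the action set, and all names are
# illustrative assumptions.

def expected_utility(action, world_model, utility):
    """Probability-weighted average of the utility of predicted outcomes."""
    return sum(p * utility(outcome)
               for outcome, p in world_model.predict(action))

def choose_action(actions, world_model, utility):
    """The problem-solving part: propose the action that maximizes expected utility."""
    return max(actions, key=lambda a: expected_utility(a, world_model, utility))
```

The separation matters: the utility function is just an argument, so the same problem-solving machinery can in principle be pointed at any goal.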
The utility function itself offers a way to rate the intelligence of different designs for the problem-solving component. You might, for example, average the utility obtained after a certain period of time across a set of test environments, or even across all possible environments. The point is that the EUM is supposed to be maximizing utility, and if one EUM is doing better than another, it must be because its problem-solver is more successful.
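A rough sketch of that rating procedure, assuming hypothetical `solver` and `env` interfaces: run each candidate in a batch of test environments for a fixed horizon and average the utility it accumulates.

```python
# Made-up interfaces, for illustration only: each environment exposes
# reset() and step(action), each solver exposes act(state, utility).

def rate_solver(solver, utility, environments, steps=1000):
    scores = []
    for env in environments:
        state = env.reset()
        total = 0.0
        for _ in range(steps):
            action = solver.act(state, utility)   # the solver sees the utility function
            state = env.step(action)
            total += utility(state)
        scores.append(total)
    return sum(scores) / len(scores)              # higher average = better problem-solver
```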
The next step towards rating the intrinsic intelligence of the problem-solving component is to compare its performance, not just across different environments, but when presented with different utility functions. Ideally, you would take into account how well it does under all possible utility functions, in all possible environments. (Since this is computer science, a “possible environment” will have a rather abstract definition, such as “any set of causally coupled finite-state machines”.)
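One way to formalize this (it echoes Legg and Hutter’s universal intelligence measure, with the utility function varied as well as the environment) is to score a problem-solver π by a weighted sum of its performance over every environment-utility pair:

```latex
% Sketch only: w is some prior weight over environment-utility pairs
% (Legg and Hutter weight environments by 2^{-K(\mu)}, their Kolmogorov
% complexity), and V is the expected utility that problem-solver \pi
% accumulates in environment \mu when scored by utility function u.
\Upsilon(\pi) \;=\; \sum_{\mu \in E} \sum_{u \in U} w(\mu, u)\, V^{\pi}_{\mu, u}
```

How you pick the weighting w is exactly the averaging question raised in the next paragraph.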
There are issues about how you average; there are issues about whether the “intelligence” you get from a definition like this is actually computable. Nonetheless, those really are just details. The point is that there are rigorous ways to rank programs with respect to their ability to solve problems, and whether or not such a ranking warrants the name “intelligence”, it is clearly of pragmatic significance. An accurate metric for the problem-solving capability of a goal-directed entity tells you how effective it will be in realizing its goals, and hence how much of an influence it can be in the world.
And this allows me to leap ahead and give a similarly informal account of what a Singularity is, what the problem of Friendliness is, and what the proposed solution is. A Singularity happens when naturally evolved intelligences, using the theory and technology of computation, create artificial intelligences of significantly superior problem-solving capability (“higher intelligence”). That superiority implies that, in any conflict of goals, a higher intelligence will in general win against a lower one, because intelligence, on this definition, just is effectiveness in bringing about goal states. Since the goals of an artificial intelligence are thoroughly contingent (certainly so for the EUM architecture, where the utility function is simply whatever you plug in), there is a risk to the natural intelligences that their own goals will be overwhelmed by those of the AIs; this is the problem of Friendliness. And the solution, or at least what I take to be the best candidate solution, is to determine the utility function (or its analogue) of the natural intelligences, derive from it the utility function of an ideal moral agent relative to those evolved preferences, and install that idealized utility function as the goal system of the artificial intelligences.
That, in schema, is my version of the research program of Eliezer’s Institute. I wanted to spell it out because I think it’s pretty easy to understand, and who knows how long it will be before Eliezer gets around to expounding it here, at length. It may be questioned from various angles; it certainly needs much more detail; but you have, right there, a formulation of what our situation is, and how to deal with it, which strikes me as eminently defensible and doable.