General intelligence doesn’t require that the intelligence be able to change its terminal goals. I honestly don’t even know whether changing one’s terminal goal is possible or even makes sense. I think the issue arises because your article does not distinguish between intermediary goals and terminal goals. Your argument is that humans are general intelligences and that humans change their terminal goals; therefore we can infer that general intelligences are capable of changing their terminal goals. But you have only ever demonstrated that people change their intermediary goals.
As an example, you state that people could reflect on and revise “goals as bizarre … as sand-grain-counting or paperclip-maximizing” if they had been brought up to have them.[1] The problem is that you assume that if a person is brought up to have a certain goal, then that goal is indeed their terminal goal. That is not the case.
For people who were raised to maximize paperclips, unless they truly became paperclip maximizers, the terminal goal could have been survival, with pleasing whoever raised them serving to increase their chance of survival. Or maybe it was seeking pleasure, and the easiest route to pleasure was making paperclips to see mommy’s happy face. All you can infer from a person’s past unceasing manufacture of paperclips is that paperclip maximization was at least one of their intermediary goals. When that person learns new information or his circumstances change (e.g. I no longer live under the thumb of my insane parents, so I don’t need to bend pieces of metal to survive), he changes his intermediary goal, but that’s no evidence that his terminal goal has changed.
The simple fact that you consider paperclip maximization an inherently bizarre goal further hints at the underlying fact that terminal goals are not updatable. Human terminal goals are a result of brain structure, which is in turn the result of evolution and the environment. The process of evolution naturally results in creatures that try to survive and reproduce. Maybe that means that survival and reproduction are our terminal goals, maybe not. Human terminal goals are virtually unknowable without a better mapping of the human brain (a complete mapping may not be required). All we can do is infer what the goals are from actions (revealed preferences), from the mapping we already have available, and from the design program (evolution). I don’t think true terminal goals can be learned solely from observing behaviors.
If an AI agent has the ability to change its goals, that makes it more dangerous, not less so. That would mean that even being able to perfectly predict the AI’s goal would not let you ensure that it is friendly. The AI might just reflect on its goal and change it to something unfriendly!
[1] This paraphrased quote from Bostrom contributes partly to this issue. Bostrom specifically says, “synthetic minds can have utterly non-anthropomorphic goals – goals as bizarre by our lights as sand-grain-counting or paperclip-maximizing” (emphasis mine). The point is that paperclip maximizing is not inherently bizarre as a goal, but that it would be bizarre for a human to have that goal given the general circumstances of humanity. But we shouldn’t consider any goal to be bizarre in an AI designed free from the circumstances controlling humanity.
Thanks for this. Indeed, we have no theory of goals here and of how they relate; maybe they must be in a hierarchy, as you suggest. And there is a question, then, whether there must be some immovable goal or goals that would have to remain in place in order to judge anything at all. This would constitute a theory of normative judgment … which we don’t have up our sleeves :)