When Pei Wang makes an argument, it is intuitively clear and does not require going through a complex chain of reasoning […]
I felt the main reason was anthropomorphism:
Such a computer system will share many properties with the human mind; […]
If intelligence turns out to be adaptive (as believed by me and many others), then a “friendly AI” will be mainly the result of proper education, not proper design.
Note that I don’t want to accuse Pei Wang of anthropomorphism. My point is that his choice of words appeals to our anthropomorphism, which makes it highly intuitive. Another example of a highly intuitive, but not very helpful, sentence:
It is my belief that an AGI will necessarily be adaptive, which implies that the goals it actively pursues constantly change as a function of its experience, and are not fully restricted by its initial (given) goals.
Intuitive, because when applied to humans, we can easily see that we change plans according to experience: you might apply for a PhD, then drop out when you find out you don’t enjoy it after all. You can abandon the goal of doing research and adopt a new goal of, say, practicing and teaching surfing.
Not very helpful, because the split between initial goals and later goals does not help you build an AI that will actually do something “good”. Here, the split between instrumental goals (means to an end) and terminal goals (the AI’s “ulterior motives”) is more important. To give a human example: in the case above, doing research and surfing are both means to the same end (like being happy, or something more carefully chosen but so complicated that nobody knows how to specify it clearly yet). For an AI, as Pei Wang implies, the initial goals aren’t necessarily supposed to constrain all future goals. But its terminal goals are indeed supposed to constrain the instrumental goals it will form later. (More precisely, the instrumental goals are supposed to follow from the terminal goals and the AI’s current model of the world.)
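To make that distinction concrete, here is a toy Python sketch (my own illustration, not Pei Wang’s design or any actual AGI architecture; the class name, options, and numbers are all made up): the terminal goal is fixed before the agent is “released”, the world model changes with experience, and the instrumental goal is re-derived from both, so it can shift from research to surfing without the terminal goal ever changing.

```python
# Toy illustration only: a fixed terminal goal plus a world model that
# changes with experience. Instrumental goals are derived from both, so
# they can shift while the terminal goal stays constant.
# All names here (CareerAgent, life_satisfaction, ...) are hypothetical.

class CareerAgent:
    def __init__(self, world_model):
        # Terminal goal: a fixed evaluation of outcomes, set before "release".
        self.terminal_goal = lambda outcome: outcome["life_satisfaction"]
        # World model: beliefs about what each option leads to; updated by experience.
        self.world_model = world_model

    def instrumental_goal(self, options):
        """Pick the option the current world model predicts best serves the terminal goal."""
        return max(
            options,
            key=lambda option: self.terminal_goal(self.world_model[option]),
        )

    def learn(self, option, observed_outcome):
        """Experience revises the world model, which may change the instrumental goal."""
        self.world_model[option] = observed_outcome


agent = CareerAgent({
    "research": {"life_satisfaction": 0.9},
    "surfing": {"life_satisfaction": 0.6},
})
print(agent.instrumental_goal(["research", "surfing"]))  # "research"

# The PhD turns out to be miserable; the world model changes, the terminal goal does not.
agent.learn("research", {"life_satisfaction": 0.2})
print(agent.instrumental_goal(["research", "surfing"]))  # "surfing"
```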
Edit: it just occurred to me that terminal goals somehow have to be encoded into the AI before we set it loose. They are necessarily initial goals (if they aren’t, the AI is unfriendly by definition, though that’s not a problem if its goals miraculously converge towards something “good”). Thinking about it, it looks like Pei Wang doesn’t believe it is possible to make an AI with stable terminal goals.
Excellent Freudian slip there.
Corrected, thanks.