I’m not necessarily arguing for this position so much as saying we need to address it. “Suicidal AI” is to the problem of constructing FAI as anarchism is to political theory; if you want to build something (an FAI, a good government) then, on the philosophical level, you have to at least take a stab at countering the argument that perhaps it is impossible to build it.
I’m working under the assumption that we don’t really know at this point what “Friendly” means, otherwise there wouldn’t be a problem to solve. We don’t yet know what we want the AI to do.
What we do know about morality is that human beings practice it. So all our moral laws and intuitions are designed, in particular, for small, mortal creatures, living among other small, mortal creatures.
Egalitarianism, for example, only makes sense if “all men are created equal” is more or less a statement of fact. What should an egalitarian human make of a powerful AI? Is it a tyrant? Well, no, a tyrant is a human who behaves as if he’s not equal to other humans; the AI simply isn’t equal. Well, then, is the AI a good citizen? No, not really, because citizens treat each other on an equal footing...
The trouble here, I think, is that all our notions of goodness are really “what is good for a human to do.” Perhaps you could extend them to “what is good for a Klingon to do”—but a lot of moral opinions are specifically about how to treat other people who are roughly equivalent to yourself. “Do unto others as you would have them do unto you.” The kind of rules you’d set for an AI would be fundamentally different from our rules for ourselves and each other.
It would be as if a human had a special, obsessive concern and care for an ant farm. You can protect the ants from dying. But there are lots of things you can’t do for the ants: be an ant’s friend, respect an ant, keep up your end of a bargain with an ant, treat an ant as a brother…
I had a friend once who said, “If God existed, I would be his enemy.” Couldn’t someone have the same sentiment about an AI?
(As always, I may very well be wrong on the Internet.)
You say: human values are made for agents of equal power; an AI would not be equal; so maybe the friendly thing to do is for it to delete itself. My question was, is it allowed to do just one or two positive things before it does this? I can also ask: if overwhelming power is the problem, can’t it just reduce itself to human scale? And when you consider all the things that go wrong in the world every day, it is obvious that there is plenty for a friendly superhuman agency to do. So the whole idea that the best thing it could do is delete itself or hobble itself looks extremely dubious. If your point was that we cannot hope to figure out what friendliness should actually be, and so we just shouldn’t make superhuman agents, that would make more sense.
The comparison to government makes sense in that the power of a mature AI is imagined to be more like that of a state than that of a human individual. It is likely that once an AI had arrived at a stable conception of purpose, it would produce many, many other agents, of varying capability and lifespan, for the implementation of that purpose in the world. There might still be a central super-AI, or its progeny might operate in a completely distributed fashion. But everything would still have been determined by the initial purpose. If it was a purpose that cared nothing for life as we know it, then these derived agencies might just pave the earth and build a new machine ecology. If it was a purpose that placed a value on humans being there and living a certain sort of life, then some of them would spread out among us and interact with us accordingly. You could think of it in cultural terms: the AI sphere would have a culture, a value system, governing its interactions with us. Because of the radical contingency of programmed values, that culture might leave us alone, it might prod our affairs into taking a different shape, or it might act to swiftly and decisively transform human nature. All of these outcomes would appear to be possibilities.