This is actually one of the standard counterarguments against the need for friendly AI, or at least against the notion that it should be an agent or be capable of acting as one.
I’ll try to quickly summarize the counter-counterarguments Nick Bostrom gives in Superintelligence. (In the book, an AI that is not an agent at all is called a tool AI. An AI that is an agent but cannot act as one, because it has no executive power in the real world, is called an oracle AI.)
Some arguments have already been mentioned:
A tool AI, or a friendly AI without executive power, cannot stop the world from building an unfriendly AI (UFAI). Its ability to prevent this and other existential risks is greatly diminished. In particular, it cannot guard us against the “unknown unknowns” (an oracle is not going to give answers to questions we are not asking).
The decisions of an oracle or tool AI might look good, but actually be bad for us in ways we cannot recognize.
There is also the possibility of what Bostrom calls mind crime. If a tool or oracle AI is not inherently friendly, it might simulate sentient minds in order to answer the questions we ask, and then kill or possibly even torture these minds. The probability that these simulations have moral rights is low, but there could be trillions of them, so even a low probability cannot be ignored.
Finally, it might be that the best strategy for a tool AI to give an answer is to internally develop an agent-type AI that is capable of self-improvement. If the default outcome of creating a self-improving AI is doom, then the tool AI scenario might in fact be less safe.