You mention having looked through the literature; in case you missed any of these, here are what I think of as the standard resources on the topic.
Gwern’s analysis of why Tool AIs want to become Agent AIs. It contains references to many other works. (It also has a positive review from Eliezer.)
Eric Drexler’s incredibly long report on Comprehensive AI Services as General Intelligence. Related: Rohin Shah’s summary, and Richard Ngo’s comment.
The Goals and Utility Functions chapter of Rohin Shah’s Value Learning sequence.
Eliezer’s very well-written post on Arbital addressing the question “Why Expected Utility?” [Added: crossposted to LessWrong here]
The LessWrong wiki article on Tool AI links to several posts on this topic.
All are well worth reading.
Edit: There was an old discussion between Holden Karnofsky and Jaan Tallinn on Tool AI on Yahoo Groups, but Yahoo Groups has been shut down. Here’s the page in the Wayback Machine, but the attachment is not available. I would appreciate someone here leaving a link to that old document; I recall it being quite thoughtful.
After some more reading, particularly the Drexler CAIS report, I realize I was more confused than I thought about Tool vs Agent AI. I think I’ve resolved it, but I’d appreciate feedback. Would the below be correct?
“Most sophisticated software behaves like both a Tool and an Agent, at different times. Google Maps reports possible routes like a tool, but it searches for paths to maximize a utility function like an agent. DeepMind’s AlphaGo may ultimately select the move that maximizes its winning probability, but it follows fixed rules in how it frames and conducts that search. Only relatively simple programs could be called purely Tool: Atari Pong genuinely runs no search function. Any advanced AI, while it could be Tool in the sense of not taking actions that impact the outside world, is likely to be Agent in the sense of optimizing within bounds internally.”
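To make that concrete, here is a toy sketch (the graph, names, and numbers are invented for illustration) of a program that is pure Tool at its interface, since it only returns a report, while being Agent-like internally, since it runs a search that optimizes a cost function over paths:

```python
import heapq

# Invented toy road graph: node -> list of (neighbor, travel_minutes)
GRAPH = {
    "home":     [("highway", 10), ("backroad", 4)],
    "highway":  [("office", 15)],
    "backroad": [("office", 25)],
    "office":   [],
}

def best_route(start, goal):
    """Tool-like interface: returns a report and takes no action.
    Agent-like internals: Dijkstra search minimizing total travel time,
    i.e. optimizing a cost function over candidate paths."""
    frontier = [(0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost   # a report only; nothing gets driven anywhere
        if node in visited:
            continue
        visited.add(node)
        for neighbor, minutes in GRAPH[node]:
            heapq.heappush(frontier, (cost + minutes, neighbor, path + [neighbor]))
    return None, float("inf")

print(best_route("home", "office"))  # (['home', 'highway', 'office'], 25)
```

The question is then whether the optimization stays confined inside best_route or leaks out into actions.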
Of the two implicit definitions of agent, “maximises a utility function” and “affects the outside world (without explicitly being told to)”, the second is the only one that is relevant to AI safety, and it is the one the actual AI community uses. In other words, trying to bring in the first definition just causes confusion.
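For what it’s worth, the two definitions come apart in both directions. A contrived sketch (pick_best, thermostat, and Heater are all made up for illustration): the first function maximises a utility function but affects nothing outside itself; the second affects the world with no optimization anywhere in sight.

```python
class Heater:
    """Stand-in for a real actuator; here it just prints."""
    def turn_on(self):
        print("heater on")
    def turn_off(self):
        print("heater off")

# “Agent” under the first definition only: maximises a utility function,
# but its only output is a report, with no effect on the outside world.
def pick_best(options, utility):
    return max(options, key=utility)

# “Agent” under the second definition only: switches a heater in the world,
# driven by a fixed rule, with no search or utility maximisation anywhere.
def thermostat(temperature_c, heater):
    if temperature_c < 18.0:
        heater.turn_on()
    else:
        heater.turn_off()

print(pick_best([1, 5, 3], utility=lambda x: -abs(x - 4)))  # -> 5
thermostat(16.5, Heater())                                  # -> heater on
```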
Didn’t realize that, but it makes complete sense. Thanks.
Also, we discussed Tool AI as a subcategory of Oracle AI in section 5.1 of Responses to Catastrophic AGI Risk; our conclusion:

“… it seems like Oracle AIs could be a useful stepping stone on the path toward safe, freely acting AGIs. However, because any Oracle AI can be relatively easily turned into a free-acting AGI and because many people will have an incentive to do so, Oracle AIs are not by themselves a solution to AGI risk, even if they are safer than free-acting AGIs when kept as pure oracles.”
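The “relatively easily turned into a free-acting AGI” step is worth spelling out, because the glue really is thin: a loop that feeds the oracle’s answers to an actuator. A minimal sketch, with a dummy oracle and a logging executor standing in for the real things:

```python
def oracle(state, candidate_actions):
    """Stand-in for a pure question-answering AI: given a state and some
    candidate actions, it only reports which one it predicts scores best."""
    def predicted_score(action):
        return len(action)              # placeholder world-model, dummy only
    return max(candidate_actions, key=predicted_score)

def unbox(execute, steps=3):
    """The thin wrapper that turns the oracle into a free-acting agent:
    ask which action is best, then actually perform it, in a loop."""
    state = "start"
    for _ in range(steps):
        action = oracle(state, ["wait", "explore", "build-factory"])
        state = execute(state, action)  # side effect on the outside world
    return state

# Demo with a harmless executor that only logs the chosen action:
def log_executor(state, action):
    print("executing:", action)
    return state

unbox(log_executor)
```

The safety-relevant properties live in unbox, not in oracle, and unbox is trivial for anyone with query access to write.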
Eric Drexler’s report on Comprehensive AI Services also contains relevant material. Here is Rohin’s summary of it.
Thanks, this example was so big and recent that I forgot it. Have added it to my answer.
Thanks to all! Very useful reading, particularly Gwern.
Jaan/Holden convo link is broken :(
https://gwern.net/doc/existential-risk/2011-05-10-givewell-holdenkarnofskyjaantallinn.doc