[EY] My particular conception of an extraordinarily powerful tool AI, which would be vastly more powerful than any other conception of tool AI that anyone has considered, would secretly be an agentive AI because the difference between trying to inform the user and trying to manipulate the user is only semantic.
This is not a valid response. Holden is saying, “Here’s this vast space of possible kinds of AIs, subsumed under the term ‘tool AI’, that you should investigate.” And Eliezer is saying, “AIs within a small subset of that space would be dangerous; therefore I’m not interested in that space.”
How do you know it is a small subset? Or a subset at all? If every interestingly powerful tool AI is secretly an agent AI, that’s bad, right?
If every interestingly powerful tool AI is secretly an agent AI, that’s bad, right?
Sure. And that’s what Eliezer would have had to argue for his response to be valid. And doing so would have required, at the very least, showing that Google Maps is secretly an agent AI.
The key sentence in Eliezer’s response is, “If a planning Oracle is going to produce better solutions than humanity has yet managed to the Rubik’s Cube, it needs to be capable of doing original computer science research and writing its own code.” Eliezer’s response is only relevant to “tool AIs” of this level. Google Maps is not on this level. The argument completely fails to apply to Google Maps, the example that supposedly motivated the response, as proven by the fact that Google Maps EXISTS and does not do anything of the kind.
It seems to me that there’s a rather large gap between “interestingly powerful” and superhuman in Eliezer’s sense. We like Google Maps because it comes up with fast, general, usually-good-enough solutions to route-planning problems, but I’m nowhere near convinced that Google Maps generates solutions that suitably trained human beings couldn’t, given the same data in a human-understandable format. Particularly not solutions that are interesting for their cleverness, originality, or other qualities we generally associate with organic intelligence.
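To make that concrete: route planning of the Google Maps flavor is, at bottom, graph search over road data. Here is a minimal sketch, with a made-up toy road network and plain Dijkstra rather than whatever Google actually runs, just to show that the “tool” here is a fixed procedure optimizing an objective handed to it, with no goals of its own.

```python
import heapq

def shortest_route(graph, start, goal):
    """Dijkstra's algorithm over a toy road graph. Route planners are built on
    this family of fixed graph-search procedures: the program minimizes the cost
    function it is given and does nothing else."""
    frontier = [(0, start, [start])]          # (cost so far, node, path taken)
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (cost + weight, neighbor, path + [neighbor]))
    return None

# Hypothetical toy road network; edge weights are travel times in minutes.
roads = {
    "home":      [("highway", 5), ("back_road", 12)],
    "highway":   [("office", 10)],
    "back_road": [("office", 4)],
}
print(shortest_route(roads, "home", "office"))   # -> (15, ['home', 'highway', 'office'])
```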
On the other hand, automated theorem provers do exist, and they’ve generated some results that humans haven’t. It’s not inconceivable to me that similar systems could be applied to Rubik’s Cube (or similar) and come up with interesting results, all without doing humanlike research or rewriting their own code. Not that this is a particularly devastating argument within the narrower context of AGI.
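For a sense of how a system can turn up results no human has bothered to derive without doing anything like research or self-modification: optimal solvers for Rubik’s-Cube-style puzzles are essentially fixed searches over a state space. A minimal sketch, using an invented four-token toy puzzle (the states and moves are made up for illustration, not any real solver’s representation):

```python
from collections import deque

def apply_move(perm, state):
    """Apply a permutation of positions (tuple of indices) to a state tuple."""
    return tuple(state[i] for i in perm)

def shortest_solution(start, goal, moves):
    """Breadth-first search for the shortest move sequence from start to goal.
    The code is fixed before the search begins; it can still return sequences
    no human has ever written down, without research or self-rewriting."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, perm in moves.items():
            nxt = apply_move(perm, state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None

# Hypothetical toy puzzle: four tokens, two legal moves (a swap and a cycle).
moves = {
    "swap01": (1, 0, 2, 3),   # exchange positions 0 and 1
    "cycle":  (1, 2, 3, 0),   # rotate all tokens left by one position
}
start = ("A", "B", "C", "D")
goal = ("D", "C", "B", "A")
print(shortest_solution(start, goal, moves))
```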
ETA: Odd. I really didn’t expect this to be downvoted. If I’m making some obvious mistake, I’d appreciate knowing what it is.