Thanks, this is really helpful! For 1, 2, and 4: this whole post is assuming, not arguing, that we will solve the technical problem of making safe and capable AI oracles that are not motivated to escape the box, give manipulative answers, send out radio signals with their RAM, etc. I was not making the argument that this technical problem is easy … I was not even arguing that it’s less hard than building a safe AI agent! Instead, I’m trying to counter the argument that we shouldn’t even bother trying to solve the technical problem of making safe AI oracles, because oracles are uncompetitive.
...That said, I do happen to think there are paths to making safe oracles that don’t translate into paths to making safe agents (see Self-supervised learning and AGI safety), though I don’t have terribly high confidence in that.
Can you find a link to where “Christiano dismisses Oracle AI”? I’m surprised that he has done that. After all, he coauthored “AI Safety via Debate”, which seems to be aimed primarily (maybe even exclusively) at building oracles (question-answering systems). Your answer to (3) is enlightening, thank you. Do you have any sense of how widespread this view is and where it’s argued? (I edited the post to add that people going for benevolent-dictator CEV AGI agents should still endorse oracle research because of the bootstrapping argument.)
Regarding the comment about Christiano, I was just referring to your quote in the last paragraph, and it seems like I misunderstood the context. Whoops.
Regarding the idea of a singleton, I mainly remember the arguments from Bostrom’s Superintelligence book and can’t quote directly. He summarizes some of the arguments here.