You made a lot of points, so I’ll be relatively brief in addressing each of them. (Taking at face value your assertion that your main goal is to start a discussion.)
1. It’s interesting to consider what it would mean for an Oracle AI to be good enough to answer extremely technical questions requiring reasoning about not-yet-invented technology, yet still “not powerful enough for our needs”. It seems like if we have something that we’re calling an Oracle AI in the first place, it’s already pretty good. In which case, it was getting to that point that was hard, not whatever comes next.
2. If you actually could make an Oracle that isn’t secretly an Agent, then sure, leveraging a True Oracle AI would help us figure out the general coordination problem, and any other problem. That seems to be glossing over the fact that building an Oracle that isn’t secretly an Agent isn’t actually something we know how to go about doing. Solving the “make-an-AI-that-is-actually-an-Oracle-and-not-secretly-an-Agent Problem” seems just as hard as all the other problems.
3. I … sure hope somebody is taking seriously the idea of a dictator AI running CEV, because I don’t see anything other than that as a stable (“final”) equilibrium. There are good arguments that a singleton is the only really stable outcome. All other circumstances will be transitory, on the way to that singleton. Even if we all get Neuralink implants tapping into our own private Oracles, how long does that status quo last? There is no reason for the answer to be “forever”, or even “an especially long time”, when the capabilities of an unconstrained Agent AI will essentially always surpass those of an Oracle-human synthesis.
4. If the Oracle isn’t allowed to do anything other than change pixels on the screen, then of course it will do nothing at all, because it needs to be able to change the voltages in its transistors, and the local EM field around the monitor, and the synaptic firings of the person reading the monitor as they react to the text … Bright lines are things that exist in the map, not the territory.
5. I’m emotionally sympathetic to the notion that we should be pursuing Oracle AI as an option because the notion of a genie is naturally simple and makes us feel empowered, relative to the other options. But I think the reason why e.g. Christiano dismisses Oracle AI is that it’s not a concept that really coheres beyond the level of verbal arguments. Start thinking about how to build the architecture of an Oracle at the level of algorithms and/or physics and the verbal arguments fall apart (see the sketch just after this list). At least, that’s what I’ve found, as somebody who originally really wanted this to work out.
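To give a flavor of what I mean by the verbal arguments falling apart at the algorithmic level, here is a minimal sketch (the names and the scoring function are hypothetical placeholders of mine, not anything from the post): the moment an “oracle” is implemented as a search over candidate answers that maximizes a score, it is structurally an agent whose action space happens to be answers.

```python
# Hedged sketch: an "oracle" that picks answers by optimizing a score is
# formally an agent whose actions are answers. candidate_answers and
# score_answer are hypothetical placeholders, not a real system.

def candidate_answers(question: str) -> list[str]:
    """Placeholder: enumerate possible answers somehow."""
    return ["yes", "no", "it depends"]

def score_answer(question: str, answer: str) -> float:
    """Placeholder: any learned measure of 'answer quality'."""
    return float(len(answer))  # stand-in for a learned model

def oracle(question: str) -> str:
    # This argmax over outputs is exactly an agent's policy: choose the
    # action (answer) that maximizes a score. If the score is at all
    # sensitive to how the answer affects the questioner, the "oracle"
    # is already optimizing over effects on the world.
    return max(candidate_answers(question),
               key=lambda a: score_answer(question, a))

print(oracle("Should we build it?"))
```

The bright line between “answering” and “acting” doesn’t show up anywhere in the code; it only exists in how we describe it.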
Thanks, this is really helpful! For 1, 2, and 4: this whole post is assuming, not arguing, that we will solve the technical problem of making safe and capable AI oracles that are not motivated to escape the box, give manipulative answers, send out radio signals with their RAM, etc. I was not making the argument that this technical problem is easy … I was not even arguing that it’s less hard than building a safe AI agent! Instead, I’m trying to counter the argument that we shouldn’t even bother trying to solve the technical problem of making safe AI oracles, because oracles are uncompetitive.
...That said, I do happen to think there are paths to making safe oracles that don’t translate into paths to making safe agents (see Self-supervised learning and AGI safety), though I don’t have terribly high confidence in that.
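To gesture at why a purely predictive training objective feels oracle-shaped, here is a toy sketch under my own assumptions (the linked post doesn’t contain this code, and a real system would be a large learned model rather than a bigram counter):

```python
# Toy sketch: a system trained only to predict text. The training signal
# is predictive accuracy on a fixed corpus; nothing in it rewards
# influencing the world. (Hypothetical toy, not the post's proposal.)
from collections import Counter, defaultdict

corpus = "the oracle answers questions. the oracle does not act."

# "Training" = counting character transitions in the corpus.
counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def predict_next(ch: str) -> str:
    """Most likely next character under the learned statistics."""
    return counts[ch].most_common(1)[0][0] if counts[ch] else " "

# Querying the "oracle" is just running the predictor forward.
text = "t"
for _ in range(20):
    text += predict_next(text[-1])
print(text)
```

Whether that purely-predictive property survives scaling up to powerful models is exactly the part I don’t have high confidence in.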
Can you find a link to where “Christiano dismisses Oracle AI”? I’m surprised that he has done that. After all, he coauthored “AI Safety via Debate”, which seems to be aimed primarily (maybe even exclusively) at building oracles (question-answering systems). Your answer to (3) is enlightening, thank you; do you have any sense for how widespread this view is and where it’s argued? (I edited the post to add that people going for benevolent dictator CEV AGI agents should still endorse oracle research because of the bootstrapping argument.)
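For what it’s worth, the debate setup can be caricatured in a few lines, which is part of why it reads as oracle-flavored to me (a hedged sketch; the function names and stub debaters are mine, not the paper’s):

```python
# Hedged sketch of the debate protocol from "AI Safety via Debate":
# two debaters alternate statements about a question; a judge who sees
# only the transcript picks a winner. All internals here are toy stubs.
from typing import Callable, List

Debater = Callable[[str, List[str]], str]

def run_debate(question: str, a: Debater, b: Debater,
               judge: Callable[[str, List[str]], str],
               rounds: int = 2) -> str:
    transcript: List[str] = []
    for _ in range(rounds):
        transcript.append("A: " + a(question, transcript))
        transcript.append("B: " + b(question, transcript))
    # Hoped-for equilibrium: it is easier to argue for the true answer
    # than against it, so training debaters to win yields honest answers.
    return judge(question, transcript)

# Toy usage with stub debaters and a keyword-matching judge.
winner = run_debate(
    "Is 7 prime?",
    lambda q, t: "Yes: 7 has no divisors other than 1 and itself.",
    lambda q, t: "No: 7 can be factored (no valid factors offered).",
    lambda q, t: "A" if "no divisors" in " ".join(t) else "B",
)
print("Judge picks debater", winner)
```

The output is an answer to a question, not an action in the world, which is why I read the paper as being primarily about question-answering systems.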
Regarding the comment about Christiano, I was just referring to your quote in the last paragraph, and it seems like I misunderstood the context. Whoops.
Regarding the idea of a singleton, I mainly remember the arguments from Bostrom’s Superintelligence book and can’t quote directly. He summarizes some of the arguments here.
> when the capabilities of an unconstrained Agent AI will essentially always surpass those of an Oracle-human synthesis.
Nitpick: the capabilities of either (a) unconstrained Agent AI(s), or (b) an Artificial Agent-human synthesis, will essentially always surpass those of an Oracle-human synthesis. We might have to work our way up to the point where AIs without humans are the more effective option.