“Will OpenAI’s work unintentionally increase existential risks related to AI?”
“Will OpenAI’s strategy succeed at reducing existential risks related to AI?”
The point is to build in a presumption of good intentions, unless you explicitly want to challenge that presumption (which I expect you do not want to do).
David’s suggestion also seems good to me, though it asks a slightly different question and is a bit wordier.
Done! I used your first proposal, as it is more in line with my original question.
I much prefer Rohin’s alternative version of: “Are OpenAI’s efforts to reduce existential risk counterproductive?”. The current version does feel like it screens off substantial portions of the potential risk.
Example? I legitimately struggle to imagine something covered by “Are OpenAI’s efforts to reduce existential risk counterproductive?” but not by “Will OpenAI’s work unintentionally increase existential risks related to AI?”; if anything it seems the latter covers more than the former.
One route would be if some of them thought that existential risks weren’t that much worse than major global catastrophes.
If I think it likely that 10% of everyone will die because the wrong people get control of killer AI drones (“slaughterbots”), and that it’s important to get to AI as quickly as possible, then we might push development forward as fast as we can because we want to be the ones in control, at the expense of some kinds of unlikely alignment problems. This person accepts a very small increase in the chance of existential risk via indirect AI issues at the price of a substantial decrease in the chance of 10% of humanity being wiped out via bad direct use of the AI. This would be intentionally increasing x-risk in expectation, and they would agree.
You might correctly point out that Paul Christiano and Chris Olah don’t think like this, but I don’t really know who is involved in leadership at OpenAI; perhaps “safe” AI to some of them means “non-military”. So this is a case that the new title rules out.
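To make the trade-off concrete, here is a toy expected-value sketch with entirely made-up probabilities (mine, purely for illustration, not anyone’s actual estimates):

```python
# Toy sketch with made-up numbers: the actor trades a small rise in x-risk
# for a large drop in the chance of a 10%-of-humanity catastrophe.

population = 8e9  # rough world population, for illustration only

scenarios = {
    # name: (P(existential catastrophe), P(10% of humanity killed by misused drones))
    "go slow": (0.010, 0.50),
    "go fast": (0.011, 0.05),  # x-risk up a little, misuse risk down a lot
}

for name, (p_xrisk, p_misuse) in scenarios.items():
    # expected deaths = extinction term + 10%-catastrophe term
    expected_deaths = p_xrisk * population + p_misuse * 0.10 * population
    print(f"{name}: expected deaths ~ {expected_deaths:,.0f} (P(x-risk) = {p_xrisk:.3f})")
```

On raw expected deaths the faster path comes out ahead even though the x-risk probability went up, which is exactly the stance I’m describing: the actor knowingly accepts a bit more x-risk and would say so.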
Yeah, that’s a good example, thanks.
(I do think it is unlikely.)
I think this is a worse question now? Like, I expect OpenAI leadership explicitly thinks of themselves as increasing x-risk a bit by choosing to attempt to speed up progress to AGI.
On net they expect it’s probably the right call, but they would also probably say “Yes, our actions are intentionally increasing the chances of x-risk in some worlds, but on net we think it’s improving things”. And supposing they’re wrong, and those worlds are the actual world, then they’re intentionally increasing x-risk. And now the question tells me to ignore that possibility.
The initial question made no mention of intention, which seemed better to me.
Do you think that they think they are increasing x-risk in expectation (where the expectation is according to their beliefs)? I’d find that extremely surprising (unless their reasoning is something like “yes, we raise it from 1 in a trillion to 2 in a trillion, this doesn’t matter”).
See my reply downthread, responding to where you asked Oli for an example.
Hmm, my perspective is that in the example you describe, OpenAI isn’t intentionally increasing the risks, in that they think it improves things overall. My line for “intentionally increasing x-risks” would be literally deciding to act while thinking/knowing that your actions are making things worse for x-risks in general, which doesn’t sound like your example.