Differential research that advances safety more than AI capability still advances AI capability.
This seems to suggest that advancing AI capability is sufficient reason to avoid technical safety research that has non-trivial overlap with capabilities. I think that’s wrong.
RE the necessary and sufficient argument:
1) Necessary: it’s unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough
2) Sufficient: I agree that never building AGI is a potential x-risk (or close enough). I don’t think it’s entirely unrealistic “to shoot for levels of coordination like ‘let’s just never build AGI’”, although I agree it’s a long shot. Supposing we had that level of coordination, we could use “never build AGI” as a backup plan while we work to solve technical safety to our satisfaction, if that is in fact possible.
Yeah, I agree with that; my suggestion above takes into account that this is likely a case of overconcern.
1) Necessary: it’s unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough
This sounds weaker to me than what I usually think of as a “necessary and sufficient” condition.
My view is more or less the one Eliezer points to here:
The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
It doesn’t matter how good their intentions are. It doesn’t matter if they don’t want to enact a Hollywood movie plot. They don’t know how to do it. Nobody knows how to do it. There’s no point in even talking about the arms race if the arms race is between a set of unfriendly AIs with no friendly AI in the mix.
And the one in the background when he says a competitive AGI project can’t deal with large slowdowns:
Because I don’t think you can get the latter degree of advantage over other AGI projects elsewhere in the world. Unless you are postulating massive global perfect surveillance schemes that don’t wreck humanity’s future, carried out by hyper-competent, hyper-trustworthy great powers with a deep commitment to cosmopolitan value — very unlike the observed characteristics of present great powers, and going unopposed by any other major government.
I would say that actually solving the technical problem clearly is necessary for good outcomes, whereas strong pre-AGI global coordination is helpful but not necessary. And the scenario where a leading AI company just builds sufficiently aligned AGI, runs it, and saves the world doesn’t strike me as particularly implausible, relative to other ‘things turn out alright’ outcomes; whereas the scenario where world leaders like Trump, Putin, and Xi Jinping usher in a permanent otherwise-utopian AGI-free world government does strike me as much crazier than the ten or hundred likeliest ‘things turn out alright’ scenarios.
In general, better coordination reduces the difficulty of the relevant technical challenges, and technical progress reduces the difficulty of the relevant coordination challenges; so both are worth pursuing. I do think that (e.g.) reducing x-risk by 5% with coordination work is likely to be much more difficult than reducing it by 5% with technical work, and I think the necessity and sufficiency arguments are much weaker for ‘just try to get everyone to be friends’ approaches than for ‘just try to figure out how to build this kind of machine’ approaches.
My view is more or less the one Eliezer points to here:
The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
There are probably no fire alarms for “nice AI designs” either, just like there are no fire alarms for AI in general.
Why should we expect people to share “nice AI designs”?