I think the best argument against institution-oriented work is that it might be harder to make a big impact. But more importantly, I think strong global coordination is necessary and sufficient, whereas technical safety is plausibly neither.
I also agree that one should consider tradeoffs, sometimes. But every time someone has raised this concern to me (I think it’s been 3x?), it’s been a clear-cut case of “why are you even worrying about that”, which leads me to believe that there are a lot of people who are overconcerned about this.
I would have said that strong global coordination before we get to AGI isn’t necessary. I’d also have said that strong global coordination without an alignment solution is insufficient, given that it’s not realistic to shoot for levels of coordination like “let’s just never build AGI”. (My model of Nate would also add here that never building AGI would mean losing an incredible amount of cosmopolitan value, enough to count as an existential catastrophe in its own right.)
Maybe we could start with you saying why you think it’s necessary and sufficient? That might give me a better understanding of what you have in mind by “institution-oriented work”.
> I also agree that one should consider tradeoffs, sometimes. But every time someone has raised this concern to me (I think it’s been 3x?), it’s been a clear-cut case of “why are you even worrying about that”, which leads me to believe that there are a lot of people who are overconcerned about this.
I wouldn’t be at all surprised if lots of people are overconcerned about this. Many people are also underconcerned, though. I feel better about public advice that encourages people to test their models of the size of relevant drops and relevant buckets, rather than just trying to correct for a bias some people have in a particular direction (which makes overcorrection easy).
> I feel better about public advice that encourages people to test their models of the size of relevant drops and relevant buckets, rather than just trying to correct for a bias some people have in a particular direction (which makes overcorrection easy).

I like this sentence a lot.

So my original response was to the statement:
> Differential research that advances safety more than AI capability still advances AI capability.
Which seems to suggest that advancing AI capability is sufficient reason to avoid technical safety that has non-trivial overlap with capabilities. I think that’s wrong.
RE the necessary and sufficient argument:
1) Necessary: it’s unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough
2) Sufficient: I agree that never building AGI is a potential x-risk (or close enough). I don’t think it’s entirely unrealistic “to shoot for levels of coordination like ‘let’s just never build AGI’”, although I agree it’s a long shot. Supposing we have that level of coordination, we could use “never build AGI” as a backup plan while we work to solve technical safety to our satisfaction, if that is in fact possible.
Yeah, I agree with that; my above suggestion is taking into account that this is a likely case of overconcern.
> 1) Necessary: it’s unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough
This sounds weaker to me than what I usually think of as a “necessary and sufficient” condition.
My view is more or less the one Eliezer points to here:
> The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
>
> It doesn’t matter how good their intentions are. It doesn’t matter if they don’t want to enact a Hollywood movie plot. They don’t know how to do it. Nobody knows how to do it. There’s no point in even talking about the arms race if the arms race is between a set of unfriendly AIs with no friendly AI in the mix.
And the one in the background when he says a competitive AGI project can’t deal with large slowdowns:
> Because I don’t think you can get the latter degree of advantage over other AGI projects elsewhere in the world. Unless you are postulating massive global perfect surveillance schemes that don’t wreck humanity’s future, carried out by hyper-competent, hyper-trustworthy great powers with a deep commitment to cosmopolitan value — very unlike the observed characteristics of present great powers, and going unopposed by any other major government.
I would say that actually solving the technical problem clearly is necessary for good outcomes, whereas strong pre-AGI global coordination is helpful but not necessary. And the scenario where a leading AI company just builds sufficiently aligned AGI, runs it, and saves the world doesn’t strike me as particularly implausible, relative to other ‘things turn out alright’ outcomes; whereas the scenario where world leaders like Trump, Putin, and Xi Jinping usher in a permanent otherwise-utopian AGI-free world government does strike me as much crazier than the ten or hundred likeliest ‘things turn out alright’ scenarios.
In general, better coordination reduces the difficulty of the relevant technical challenges, and technical progress reduces the difficulty of the relevant coordination challenges; so both are worth pursuing. I do think that (e.g.) reducing x-risk by 5% with coordination work is likely to be much more difficult than reducing it by 5% with technical work, and I think the necessity and sufficiency arguments are much weaker for ‘just try to get everyone to be friends’ approaches than for ‘just try to figure out how to build this kind of machine’ approaches.
> My view is more or less the one Eliezer points to here:
>
> > The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
There are probably no fire alarms for “nice AI designs” either, just like there are no fire alarms for AI in general.
Why should we expect people to share “nice AI designs”?
Can you give some arguments for these views?