(Or, is coordination easier in a long timeline?)
It seems like it would be good if the world could coordinate to not build AGI. That is, at some point in the future, when some number of teams have the technical ability to build and deploy an AGI, they all agree to voluntarily delay (perhaps on penalty of sanctions) until they’re confident that humanity knows how to align such a system.
Currently, this kind of coordination seems like a pretty implausible state of affairs. But I want to know whether it becomes more or less plausible as time passes.
The following is my initial thinking in this area. I don’t know the relative importance of the factors that I listed, and there’s lots that I don’t understand about each of them. I would be glad for…
Additional relevant factors.
Arguments that some factor is much more important than the others.
Corrections, clarifications, or counterarguments to any of this.
Other answers to the question that ignore my thoughts entirely.
If coordination gets harder over time, that’s probably because...
Compute increases make developing and/or running an AGI cheaper. The most obvious consideration is that the cost of computing falls each year. If one of the bottlenecks for an AGI project is having large amounts of compute, then “having access to sufficient compute” is a gatekeeper criterion on who can build AGI. As the cost of computing continues to fall, more groups will be able to run AGI projects. The more people who can build an AGI, the harder it becomes to coordinate all of them into not deploying it.
Note that it is unclear to what degree there is currently, or will be, a hardware overhang. If someone in 2019 could already run an AGI on only $10,000 worth of AWS compute, if only they knew how to build one, then the cost of compute is not relevant to the question of coordination.
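To make the falling-cost-of-compute consideration concrete, here is a minimal back-of-the-envelope sketch in Python. The price-halving time, the present-day compute cost of an AGI project, and the actor budgets are all hypothetical placeholders chosen for illustration, not estimates of the real quantities.

```python
# Illustrative only: how a fixed compute requirement becomes affordable to more
# actors as the price of compute falls. All numbers below are hypothetical.

HALVING_TIME_YEARS = 2.5        # assumed price-halving time for compute (hypothetical)
AGI_COMPUTE_COST_TODAY = 1e9    # assumed dollar cost of AGI-scale compute today (hypothetical)

# Hypothetical budgets (in dollars) of actors who might attempt an AGI project.
ACTOR_BUDGETS = [10e9, 3e9, 1e9, 300e6, 100e6, 30e6, 10e6]

def cost_after(years: float) -> float:
    """Cost of the same amount of compute after `years`, given the assumed halving time."""
    return AGI_COMPUTE_COST_TODAY * 0.5 ** (years / HALVING_TIME_YEARS)

for years in (0, 5, 10, 15):
    cost = cost_after(years)
    n_actors = sum(budget >= cost for budget in ACTOR_BUDGETS)
    print(f"year +{years:>2}: compute cost ~ ${cost:,.0f}, "
          f"actors who can afford it: {n_actors} of {len(ACTOR_BUDGETS)}")
```

Whatever the real numbers are, the qualitative point is the same: an exponentially falling price turns “access to sufficient compute” from a tight gatekeeper into a loose one, unless a hardware overhang means it was never the gatekeeper in the first place.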
The number of relevant actors increases. If someone builds an AGI in the next year, I am reasonably confident that that someone will be Deep Mind. I expect that in 15 years, if I knew that AGI would be developed one year from then, it would be much less overdetermined which group was going to build it, because there will be many more well-funded AI teams with top talent, and, most likely, none of them will have as strong a lead as Deep Mind currently appears to have.
This consideration suggests that coordination gets harder over time. However, this depends heavily on other factors (like how accepted AI safety memes are) that determine how easily Deep Mind could coordinate internally.
If coordination gets easier over time, that’s probably because…
AI safety memes become more and more pervasive and generally accepted. It seems that coordination is easier in worlds where it is uncontroversial and common knowledge that an unaligned AGI poses an existential risk, because everyone agrees that they will lose big if anyone builds an AGI.
Over the past 15 years, the key arguments of AI safety have gone from being extremely fringe to a reasonably regarded (if somewhat controversial) position, well inside the Overton window. Will this process continue? Will it be commonly accepted by ML researchers in 2030 that advanced AI poses an existential threat? Will it be commonly accepted by the leaders of nation-states?
What will the perception of safety be in a world where there is another AI winter? Suppose that narrow ML proves to be extremely useful in a large number of fields, but the hype about AGI being right around the corner turns out to be a bubble that bursts, and there is broad disinterest in AGI again. What happens to the perception of AI safety? Is there a sense of “It looks like AI Alignment wasn’t important after all”? How cautious will researchers be in developing new AI technologies?
[Partial subpoint to the above consideration] Individual AI teams develop more serious information-security practices. If some team at Deep Mind discovered AGI today, and the Deep Mind leadership opted to wait to ensure safety before deploying it, I don’t know how long it would be until some relevant employees left to build AGI on their own, or some other group (such as a state actor) stole their technology and deployed it.
I don’t know if this is getting better or worse over time.
The technologies for maintaining surveillance of would-be AGI developers improve. Coordination is made easier by technologies that aid in enforcement. If surveillance technology improves, that seems like it would make coordination easier. As a special case, highly reliable lie detection or mind-reading technologies would be a game-changer for making coordination easier.
Is there a reason to think that offense will beat defense in this area? Surveillance could get harder over time if the technology for detecting and defeating surveillance outpaces the technology for surveilling.
Security technology improves. Similarly, improvements in computer security (and traditional info security) would make it easier for actors to voluntarily delay deploying advanced AI technologies, because they could trust that their competitors (other companies and other nations) wouldn’t be able to steal their work.
I don’t know if this is plausible at all. My impression is that the weak point of all security systems is the people involved. What sort of advancements would make the human part of a security system more reliable?
This was a very important question that I had previously not even been thinking about; I had implicitly been assuming it was better to delay AGI. Now I’m mostly unsure, but I do suspect coordination probably does get harder over time.
I am still somewhat confused that I hadn’t properly asked myself this question that crisply before this post came out. It sure seems like a key question.
Now, almost two years later, I don’t have fully satisfying answers, but I do think that this decomposition has helped me a few times since then, and I still really want to see more work on this question.