The main arguments on this list that I mostly agree with are probably these:
Many AIs will be developed within a short time, leading to a multipolar situation, and they will have no special ability to coordinate with each other. The various AIs continue to work within and support the framework of the existing economy and laws, and prefer to preserve rights and property for the purpose of precedent, out of self-interest. The system successfully prevents any single AI from taking over, and humanity is protected.
and
The AI Alignment Problem will turn out to be unexpectedly easy, and we will solve it in time. Additionally, whoever is “in the lead” will have enough extra time to implement the solution without losing the lead. Race dynamics won’t mess everything up.
I have some quibbles with some of these claims, however. I don’t expect there to be a single solution to AI alignment. Rather, I expect a spectrum of approaches and best practices that work to varying degrees, with none of them being perfect. I would also put less emphasis than you do on the actions taken by the actor in the lead; instead, I would point to broader engineering insights, norms among labs, and regulations when explaining why alignment might work out.
Also, I expect AIs will be able to coordinate much better than humans in the long run. I just doubt this means all AIs will merge into a single agent and dispense with laws. Even if AIs do merge, I doubt they would do so in a way that drives humanity extinct, since I think the value alignment component would probably prevent that.