Similarly, humans are terrible at coordination compared to AIs.
Are there any key readings you could share on this topic? I’ve come across arguments about AIs coordinating via DAOs or by reading each other’s source code, including in Andrew Critch’s RAAP. Is there any other good discussion of the topic?
Unfortunately I don’t know of a single reading that contains all the arguments.
This post is relevant and has some interesting discussion below IIRC.
Mostly I think the arguments for AIs being superior at coordination are:
1. It doesn’t seem like humans are near-optimal at coordination, so AIs will eventually be superior to humans at coordination, just as they will eventually be superior at most other things that humans can do but not near-optimally.
2. We can think of various fancy methods (such as reading each other’s source code, or committing via DAOs) that AIs might use and humans don’t or can’t; see the sketch after this list.
3. There seems to be a historical trend of increasing coordination ability / social tech; we should expect it to continue with AI.
4. Even if we just model AIs as somewhat smarter, more agentic, more rational humans… it still seems like that would probably be enough. Humans have successfully coordinated coups and uprisings before; if we imagine the conspirators are all mildly superhuman, their odds only improve.
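For concreteness, here’s a minimal sketch of the source-code-reading mechanism from point 2, in the spirit of Tennenholtz-style program equilibrium: an agent in a one-shot Prisoner’s Dilemma that cooperates exactly when its opponent is running identical code. The function names and the `play` harness are illustrative assumptions, not any real framework.

```python
# Sketch: cooperation via mutual source-code inspection
# (Tennenholtz-style program equilibrium). Illustrative only.
import inspect

def clique_bot(opponent_source: str) -> str:
    """Cooperate iff the opponent is running exactly this code."""
    return "C" if opponent_source == inspect.getsource(clique_bot) else "D"

def defect_bot(opponent_source: str) -> str:
    """Always defect, regardless of the opponent's source."""
    return "D"

def play(agent_a, agent_b):
    """One-shot PD in which each agent sees the other's source before moving."""
    return (agent_a(inspect.getsource(agent_b)),
            agent_b(inspect.getsource(agent_a)))

print(play(clique_bot, clique_bot))  # ('C', 'C'): mutual cooperation
print(play(clique_bot, defect_bot))  # ('D', 'D'): no exploitation
```

The point is that mutual source transparency makes the conditional commitment verifiable, which is roughly what humans can’t do: we can’t credibly expose our own decision procedures to each other.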
I think it might be possible to design an AI scheme in which the AIs don’t coordinate with each other even though it would be in their interest: e.g., perhaps they could be trained to be bad at coordinating and strongly rewarded for defecting on each other (a sketch of that reward shaping follows below). But I don’t think it’ll happen by default.
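To illustrate the kind of anti-coordination training I have in mind, here is a hypothetical sketch that reshapes a Prisoner’s Dilemma payoff so that defecting on a peer AI earns a training bonus and successful cooperation is penalized. The constants and names are made up for illustration, not a real training setup.

```python
# Hypothetical reward shaping that discourages AI-AI coordination.
BASE_PAYOFF = {          # (my_move, their_move) -> my base reward
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}
DEFECT_BONUS = 4         # extra training reward for defecting on a peer AI
COOP_PENALTY = 2         # penalty for successfully coordinating

def shaped_reward(my_move: str, their_move: str) -> float:
    """Training reward that makes defection strictly dominant."""
    r = BASE_PAYOFF[(my_move, their_move)]
    if my_move == "D":
        r += DEFECT_BONUS
    if my_move == "C" and their_move == "C":
        r -= COOP_PENALTY
    return r

# Under this shaping, defection strictly dominates:
# shaped_reward("D", "C") = 9 > shaped_reward("C", "C") = 1.
assert shaped_reward("D", "C") > shaped_reward("C", "C")
assert shaped_reward("D", "D") > shaped_reward("C", "D")
```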