Which decision theory LLMs follow, and how consistently they follow it, affects their ability to cooperate. This could mean the difference between peace and conflict in AI-assisted political bargaining, or it could enable AIs to collude when one is meant to monitor the other, undermining human control.
Do you have any thoughts on “red lines” for AI collusion? That is, “if an AI could do X, then we should acknowledge that AIs can likely collude with each other in monitoring setups.”