All of these operationalizations are about exact notions from the training setup.
Another important notion is revealed identity:
Does the AI think of the other AI as “itself” to the extent that it thinks about stuff like this at all?
Do the AIs cooperate with each other in a way that looks reasonably similar to cooperating with yourself?