To be a trustworthy tool rather than a potential trap, the source code to Y has to be completely accurate and has to have final decision-making authority. Y’s programmer has to be able to accurately say “Here is enough information to predict any decision that my irrevocably delegated representative would ever possibly make in this interaction.” Saying that this is “without traditional means of communication” is technically accurate but very deceptive. Saying that this is “no actual communication” is outright backwards; if anything it’s much more communication than is traditionally imagined.
Unlimited communication, even. In the hypothetical “AGIs can inspect each other’s source code” case, the AGIs could just as well run that source code and have two separate conversations, one between each original and its counterpart’s copy. If the AGIs’ senses of ethics were sufficiently situation-dependent, then to generate useful proofs they’d need to be able to inspect copies of each other’s current state as well, in which case the two separate conversations might well be identical.
This is all true, but you can come up with situations where exchanging source code is more relevant. For instance, Robin Hanson has frequently argued that agents will diverge from each other as they explore the universe, and that someone will start burning the cosmic commons. This is a Prisoner’s Dilemma without traditional communication (since signals are limited by lightspeed, it would be too late to stop someone distant from defecting once you see they’ve started). But the “exchange of source code” kind of coordination is more feasible.
Also, I don’t know whether anyone polled traditional game theorists, but I’d bet that some of them would have expected it to be impossible, even with exchange of source code, to achieve anything better than what CliqueBot does.
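For anyone who hasn't seen CliqueBot spelled out, here is a rough Python sketch of the idea as I understand it: a program that cooperates only when its opponent's source is an exact copy of its own. The setup is my own assumption for illustration (programs as source strings defining a hypothetical `decide(my_source, opponent_source)` function, with `load` and `play` as made-up helpers), not anyone's actual tournament code.

```python
# Assumed convention (hypothetical): each program is a source string that
# defines decide(my_source, opponent_source) and returns "C" or "D".

CLIQUE_BOT = '''
def decide(my_source, opponent_source):
    # Cooperate only with an exact textual copy of myself.
    return "C" if opponent_source == my_source else "D"
'''

DEFECT_BOT = '''
def decide(my_source, opponent_source):
    # Always defect, regardless of the opponent.
    return "D"
'''


def load(source):
    """Execute a program's source and return its decide function."""
    namespace = {}
    exec(source, namespace)
    return namespace["decide"]


def play(source_a, source_b):
    """Give each program both source strings and collect the two moves."""
    move_a = load(source_a)(source_a, source_b)
    move_b = load(source_b)(source_b, source_a)
    return move_a, move_b


if __name__ == "__main__":
    print(play(CLIQUE_BOT, CLIQUE_BOT))
    print(play(CLIQUE_BOT, DEFECT_BOT))
    # A functionally identical bot with one extra comment is not recognized.
    print(play(CLIQUE_BOT, CLIQUE_BOT + "# trivially different\n"))
```

The last case is the limitation: two copies cooperate, but any textual difference, even a comment, gets treated as a stranger, which is why doing better than CliqueBot is interesting.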