I think this is built out of several deeply misunderstood ideas.
If we get two AIs, where both are somehow magically aligned (highly unlikely), we are in a pretty good situation. A serious fight between the AIs would satisfy neither party. So either one AI quietly hacks the other, turning it off with minimal destruction, or the AIs cooperate, since they have pretty similar utility functions and can find a future they both like.
Nowhere does FDT assume other actors have the same utility function as it does. Why do you think it assumes that? It doesn’t assume the other agent is FDT; it doesn’t make any silly assumptions like that. If both agents are FDT, and have common knowledge of each other’s source code, they will cooperate, even if their goals are wildly different.
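To make the source-code condition concrete, here is a minimal toy sketch (my own illustration with made-up names, not anything specified by FDT): each agent cooperates iff the other verifiably runs the same decision procedure, and the utility function plays no role in that check.

```python
# Toy sketch: cooperation via common knowledge of source code, with
# different utility functions. "Source code" is idealized as a string
# that each agent can compare against its own.

def make_agent(utility):
    """Return (source, decide) for an agent. `utility` is deliberately
    unused in the cooperation test: only the decision procedure matters."""
    source = "cooperate_iff_same_source"
    def decide(other_source):
        # Cooperate exactly when the other agent runs this same rule.
        return "C" if other_source == source else "D"
    return source, decide

src_a, decide_a = make_agent("maximize paperclips")
src_b, decide_b = make_agent("maximize staples")
print(decide_a(src_b), decide_b(src_a))  # -> C C, despite opposed goals
```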
With a high-bandwidth internet link and logically precise statements, we won’t get serious miscommunication.
“If both agents are FDT, and have common knowledge of each other’s source code”
Any common knowledge they can draw up can go into a coordinating agent (adjudicator); all it needs is to be shared among the coalition, and it doesn’t need to contain any particular data. The problem is verifying that all members of the coalition will follow the policy chosen by the coordinating agent, and common knowledge of source code is useful for that. But that could just be the source code of the trivial rule of always following the policy given by the coordinating agent.
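A hedged sketch of that trivial rule, under the same idealized assumptions as above (source code as a comparable string; the names and the example situation are mine): the only thing that needs to be common knowledge is the obedience rule itself, since the adjudicator is shared.

```python
# Each member's entire source code is the obedience rule below.
# Members verify each other by checking that everyone runs it.
OBEY_SOURCE = "act(i, situation): return adjudicator_policy(situation)[i]"

def adjudicator_policy(situation):
    # Shared coordinating agent: maps a situation to a joint action
    # profile, one action per member. (Placeholder logic; a real
    # adjudicator would bargain over outcomes.)
    return {"trade": ["C", "C"]}.get(situation, ["D", "D"])

def member_act(my_index, all_sources, situation):
    # Obey the adjudicator only if every member verifiably runs the
    # same obedience rule; otherwise fall back to defecting.
    if all(src == OBEY_SOURCE for src in all_sources):
        return adjudicator_policy(situation)[my_index]
    return "D"

print(member_act(0, [OBEY_SOURCE, OBEY_SOURCE], "trade"))       # -> C
print(member_act(0, [OBEY_SOURCE, "something_else"], "trade"))  # -> D
```

The adjudicator’s internals can be arbitrarily complicated; what matters for verification is the one short string OBEY_SOURCE.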
One possible policy the adjudicator can choose is falling back to unshared/private BATNAs, aborting the bargain, and of course doing other things outside the scope of this particular bargain. These things are not parts of the obey-the-adjudicator algorithm, but consequences of following it. So common knowledge of everything is not needed, only common knowledge of the adjudicator and its authority over the coalition. (This is also a possible way of looking at UDT, where a single agent in many possible states, acting through many possible worlds, coordinates among its variants.)
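Continuing the toy sketch above (again my own framing, with a made-up acceptability test): aborting and falling back to a private BATNA is just one more output the shared policy can issue, not a change to the obedience rule itself.

```python
def adjudicator_policy(offers):
    # `offers` maps each member to its (privately reported) surplus from
    # the bargain relative to its own BATNA -- a hypothetical test.
    if min(offers.values()) < 0:
        # The bargain isn't worth it for someone: the chosen policy is
        # simply "abort and act on your own unshared BATNA".
        return {m: "FALL_BACK_TO_PRIVATE_BATNA" for m in offers}
    return {m: "EXECUTE_JOINT_PLAN" for m in offers}

print(adjudicator_policy({"a": 3, "b": 2}))   # both execute the joint plan
print(adjudicator_policy({"a": 3, "b": -1}))  # both told to abort; each
                                              # then acts on its private BATNA
```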