If two Logical Decision Theory agents with perfect knowledge of each other's source code play the prisoner's dilemma, theoretically they should cooperate.
LDT uses logical counterfactuals in its decision-making.
If the agents are CDT, then logical counterfactuals are not involved.
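Schematically (a rough gloss of the distinction, not the canonical formulation of either theory), the two rules condition on different kinds of counterfactuals:

$$\text{CDT:}\ \arg\max_a \mathbb{E}\big[U \mid \mathrm{do}(A=a)\big] \qquad \text{LDT:}\ \arg\max_a \mathbb{E}\big[U \,\big\|\, \text{``my decision algorithm outputs } a\text{''}\big]$$

where $\mathrm{do}(\cdot)$ is a causal intervention on the physical act, and $\|$ stands for a logical counterfactual on the output of the agent's algorithm, wherever that algorithm happens to be instantiated (including inside the other player's model of it).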
If they have source code, then they are not perfectly rational and cannot in general implement LDT. They can at best implement a boundedly rational subset of LDT, which will have flaws.
Assume the contrary: then each agent can verify that the other implements LDT, since perfect knowledge of the other's source code includes the knowledge that it implements LDT. In particular, each can verify that the other's code implements a consistent system that includes arithmetic, and can run the other on its own source code, thereby verifying that it itself implements a consistent system that includes arithmetic. By Gödel's second incompleteness theorem, this is not possible for any consistent system that includes arithmetic.
The only way consistency can be preserved is if at least one of the agents cannot actually verify that the other has a consistent deduction system including arithmetic. So at least one of them is not an LDT agent with perfect knowledge of the other's source code.
We can in principle assume perfectly rational agents that implement LDT, but they cannot be described by any algorithm and we should be extremely careful in making suppositions about what they can deduce about each other and themselves.
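Here is one way to sketch that step more formally (my reconstruction of the argument, writing $T_A$, $T_B$ for the agents' deduction systems and $\Box_{T}$ for provability in $T$):

$$T_A \vdash \mathrm{Con}(T_B) \quad\text{(A verifies B's system is consistent)}$$
$$T_A \vdash \Box_{T_B}\mathrm{Con}(T_A) \quad\text{(A runs B's code on A's source)}$$
$$T_A \vdash \neg\mathrm{Con}(T_A) \rightarrow \Box_{T_B}\neg\mathrm{Con}(T_A) \quad\text{(formalized $\Sigma_1$-completeness of $T_B$)}$$

The last two lines give $T_A \vdash \neg\mathrm{Con}(T_A) \rightarrow \neg\mathrm{Con}(T_B)$, and combining that with the first line yields $T_A \vdash \mathrm{Con}(T_A)$, which Gödel's second incompleteness theorem forbids for any consistent $T_A$ that includes arithmetic.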
I get the impression that “has the agent’s source code” is some Yudkowskyism which people use without thinking.
Every time someone says that, I always wonder “are you claiming that the agent that reads the source code is able to solve the Halting Problem?”
The halting problem is a worst-case result. Most agents aren't maximally ambiguous about whether or not they halt. And for those that are, well, it depends on what the rules are for agents that don't halt.
There are setups where each agent uses a nonphysically large but finite amount of compute. There was a paper I saw somewhere a while ago where both agents did a brute-force proof search for the statement "if I cooperate, then they cooperate" and cooperated if they found a proof.
(I.e. searching all proofs containing fewer than 10^100 symbols.)
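As a schematic of that construction (a sketch only, not taken from any paper: the bounded proof search is the hard part, so it is just an injected callable here, and all the names are made up):

```python
# Sketch of a bounded-proof-search cooperator, as described above.
# Hypothetical: a real `prover` would enumerate proofs in some formal system
# up to the symbol bound; here it is supplied by the caller.

def make_proof_search_bot(prover, bound=10**100):
    """Agent that cooperates iff `prover` finds a proof (within `bound` symbols)
    of "if I cooperate, then my opponent cooperates"."""
    def agent(my_source, opponent_source):
        claim = (f"Cooperates({my_source}, {opponent_source}) -> "
                 f"Cooperates({opponent_source}, {my_source})")
        return "C" if prover(claim, bound) else "D"
    return agent

# With a trivial prover that never finds a proof, the bot defects; the quoted
# setup assumes a genuine (astronomically expensive) proof enumeration instead.
bot = make_proof_search_bot(lambda claim, bound: False)
print(bot("source_of_A", "source_of_B"))  # -> "D"
```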
In a situation where you are asking a question about an ideal reasoner, having the agents be finite means you are no longer asking about an ideal reasoner. If you put an ideal reasoner in a Newcomb problem, he may very well think "I'll simulate Omega and act according to what I find." (Or, more likely, run some more complicated algorithm that indirectly amounts to that.) If the agent can't do this, he may not be able to solve the problem. Of course, real humans can't, but this may just mean that real humans, because they are finite, are unable to solve some problems.