If two Logical Decision Theory agents with perfect knowledge of each other's source code play the prisoner's dilemma, theoretically they should cooperate.
LDT uses logical counterfactuals in its decision-making.
If the agents are CDT, then logical counterfactuals are not involved.
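Schematically (a rough gloss of the distinction, not the canonical formulation of either theory), the two rules condition on different kinds of counterfactuals:

$$\text{CDT:}\ \arg\max_a \mathbb{E}\big[U \mid \mathrm{do}(A=a)\big] \qquad \text{LDT:}\ \arg\max_a \mathbb{E}\big[U \,\big\|\, \text{``my decision algorithm outputs } a\text{''}\big]$$

where $\mathrm{do}(\cdot)$ is a causal intervention on the physical act, and $\|$ stands for a logical counterfactual on the output of the agent's algorithm, wherever that algorithm happens to be instantiated (including inside the other player's model of it).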
If they have source code, then they are not perfectly rational and cannot in general implement LDT. They can at best implement a boundedly rational subset of LDT, which will have flaws.
Assume the contrary: then each agent can verify that the other implements LDT, since perfect knowledge of the other's source code includes the knowledge that it implements LDT. In particular, each can verify that the other's code implements a consistent system that includes arithmetic, and can run the other on its own source code, thereby verifying that it itself implements a consistent system that includes arithmetic. By Gödel's second incompleteness theorem, this is not possible for any consistent system that includes arithmetic.
The only way consistency can be preserved is if at least one of the agents cannot actually verify that the other has a consistent deduction system including arithmetic. So at least one of them is not an LDT agent with perfect knowledge of the other's source code.
We can in principle assume perfectly rational agents that implement LDT, but they cannot be described by any algorithm and we should be extremely careful in making suppositions about what they can deduce about each other and themselves.
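Here is one way to sketch that step more formally (my reconstruction of the argument, writing $T_A$, $T_B$ for the agents' deduction systems and $\Box_{T}$ for provability in $T$):

$$T_A \vdash \mathrm{Con}(T_B) \quad\text{(A verifies B's system is consistent)}$$
$$T_A \vdash \Box_{T_B}\mathrm{Con}(T_A) \quad\text{(A runs B's code on A's source)}$$
$$T_A \vdash \neg\mathrm{Con}(T_A) \rightarrow \Box_{T_B}\neg\mathrm{Con}(T_A) \quad\text{(formalized $\Sigma_1$-completeness of $T_B$)}$$

The last two lines give $T_A \vdash \neg\mathrm{Con}(T_A) \rightarrow \neg\mathrm{Con}(T_B)$, and combining that with the first line yields $T_A \vdash \mathrm{Con}(T_A)$, which Gödel's second incompleteness theorem forbids for any consistent $T_A$ that includes arithmetic.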
I get the impression that “has the agent’s source code” is some Yudkowskyism which people use without thinking.
Every time someone says that, I always wonder “are you claiming that the agent that reads the source code is able to solve the Halting Problem?”
The halting problem is a worst-case result. Most agents aren't maximally ambiguous about whether or not they halt. And for those that are, well, it depends on what the rules are for agents that don't halt.
There are setups where each agent uses a nonphysically large but finite amount of compute. There was a paper I saw somewhere a while ago where both agents did a brute-force proof search for the statement "if I cooperate, then they cooperate" and cooperated if they found a proof.
(I.e. searching all proofs containing fewer than 10^100 symbols.)
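As a schematic of that construction (a sketch only, not taken from any paper: the bounded proof search is the hard part, so it is just an injected callable here, and all the names are made up):

```python
# Sketch of a bounded-proof-search cooperator, as described above.
# Hypothetical: a real `prover` would enumerate proofs in some formal system
# up to the symbol bound; here it is supplied by the caller.

def make_proof_search_bot(prover, bound=10**100):
    """Agent that cooperates iff `prover` finds a proof (within `bound` symbols)
    of "if I cooperate, then my opponent cooperates"."""
    def agent(my_source, opponent_source):
        claim = (f"Cooperates({my_source}, {opponent_source}) -> "
                 f"Cooperates({opponent_source}, {my_source})")
        return "C" if prover(claim, bound) else "D"
    return agent

# With a trivial prover that never finds a proof, the bot defects; the quoted
# setup assumes a genuine (astronomically expensive) proof enumeration instead.
bot = make_proof_search_bot(lambda claim, bound: False)
print(bot("source_of_A", "source_of_B"))  # -> "D"
```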
In a situation where you are asking a question about an ideal reasoner, having the agents be finite means you are no longer asking about an ideal reasoner. If you put an ideal reasoner in a Newcomb problem, he may very well think "I'll simulate Omega and act according to what I find." (Or, more likely, run some more complicated algorithm that indirectly amounts to that.) If the agent can't do this, he may not be able to solve the problem. Of course, real humans can't, but this may just mean that real humans, because they are finite, are unable to solve some problems.