orthonormal comments on Robust Cooperation in the Prisoner’s Dilemma

orthonormal Jun 19, 2013, 4:45 PM
1 point
There’s a hierarchy for existence of short proofs similar to the hierarchy for existence of proofs in this tower of formal systems. (If there weren’t, then these agents wouldn’t be much discussing! As it is, though, there are bounded algorithmic analogues of all these modal agents.) However, proving properties of such algorithms gets a lot trickier there, as you need to keep track of how long it takes to make all the trivial deductions in addition to Lobian ones.

Essentially, what you’re suggesting is something like an algorithmic analogue of the modal agent you considered before: searching for short proofs of exploitability, then longer proofs of mutual cooperation, is akin to searching for exploitability in a lower formal system and then mutual cooperation in a higher one. Not a bad idea! But when we transfer it to the easier-to-work-with context, it’s clear that waiting for a proof search to terminate must cost you against anything cooperative but impatient/low-powered. (Explicitly, the algorithm you suggested can never cooperate with an agent that only looks for proofs of cooperation up to length N.)

One last thing: all proofs in PA of the form “there does not exist a proof in PA of X with length less than N” must have length at least N, for analogous reasons to the fact that PA cannot prove its own consistency.
- Pentashagon Jun 19, 2013, 10:06 PM
  0 points
  Parent
  
  Essentially, what you’re suggesting is something like an algorithmic analogue of the modal agent you considered before: searching for short proofs of exploitability, then longer proofs of mutual cooperation, is akin to searching for exploitability in a lower formal system and then mutual cooperation in a higher one. Not a bad idea! But when we transfer it to the easier-to-work-with context, it’s clear that waiting for a proof search to terminate must cost you against anything cooperative but impatient/low-powered. (Explicitly, the algorithm you suggested can never cooperate with an agent that only looks for proofs of cooperation up to length N.)
  
  What about optimizing proof search order? An extreme example would just be an agent with a lookup table indexed by the Goedel number of each possible agent up to a maximum value of some N, with either Defect or Cooperate in the lookup table which is the action the agent will take. If impatient agents came first in the lookup table they could quickly prove cooperation. The agent could append its quine to the end of the lookup table before searching in order to cooperate with itself. There should be an optimal lookup table for any length N and set S of games against other agents, defined as the lookup table that generates the highest score when playing all the games in set S.
  
  Instead of including the lookup table the agent could perform the same sort of proof search necessary to generate the lookup table at runtime. If it searched for proofs in roughly the same order as the corresponding entries in the lookup table then impatient opposing agents would conveniently find a cooperation proof early in the simulation of the agent. In this case there would be an optimal list of rules for proof search order (perhaps including direct proofs for the most likely impatient opposing agents) that would approach the optimality of the lookup table, or perhaps exceed it if enough opposing agents give up when they see the size of the lookup-table-agent.
  
  I have no idea if any of this is theoretically interesting or just a hack to potentially play PD slightly better.
  
  One last thing: all proofs in PA of the form “there does not exist a proof in PA of X with length less than N” must have length at least N, for analogous reasons to the fact that PA cannot prove its own consistency.
  
  And presumably more like O(2^N) in practice? I don’t know what percentage of bit strings are well formed formulas in an average formal language, but I would imagine it’s a significant fraction. The proof searcher would have to check every wff to completion because it doesn’t have Con(PA) to rule out contradictions, and I assumed that most of the proofs would be of the form “First the agent writes a 0 and moves the read-head right and goes to state 5, … , then the agent halts in a Cooperating state”, so N would probably be very large.