Especially now that this is published, I no longer feel much of a need to engage with the hypothesis that rational agents mutually defect in the one-shot or iterated PD. Perhaps you meant to analyze causal-decision-theory agents? But that would be of only academic interest.
Funny, when talking to Patrick at the workshop I made pretty much the opposite point. Maybe it's worth spelling out here, since I came up with Löbian cooperation in the first place:
The PD over modal agents is just another game-theoretic problem. As the zoo of proposed modal agents grows, our failure to find a unique “rational” modal agent is a reflection of our inability to find a unique “rational” strategy in an arbitrary game. Waging war on an established result is typically a bad idea; we probably won’t roll back the clock on game theory and reduce the n-player case to the 1-player case. This particular game is still worth investigating, but I don’t hope to find any unique notion of rationality in there.
Without a unique notion of rationality, it seems premature to say that rational agents won’t play a game in a certain way. Who knows what limitations they might have? For example, PrudentBot based on PA will defect against PrudentBot based on PA+1.
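For reference, here is (roughly, from memory, with approximate notation) the shape of PrudentBot’s cooperation condition; the point is that the proof system appears explicitly in the definition, so “PrudentBot based on PA” and “PrudentBot based on PA+1” really are different agents, and the weaker one can’t necessarily reproduce the stronger one’s proofs:

```latex
% Sketch of PrudentBot's cooperation condition (approximate, from memory):
% cooperate with X iff PA proves X cooperates back, and PA+1 (PA plus Con(PA))
% proves that X defects against DefectBot.
\[
\mathrm{PrudentBot}(X) = C
\;\iff\;
\bigl(\mathrm{PA} \vdash X(\mathrm{PrudentBot}) = C\bigr)
\;\wedge\;
\bigl(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}) \vdash X(\mathrm{DefectBot}) = D\bigr)
\]
```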
I’ve only had time to read the introduction so far; but if it’s not mentioned in the paper itself, it seems that PrudentBot should be “correct” not only in defecting against CooperateBot; it should also defect against DefectBot. In fact, in a one-shot PD, it seems as if it should defect against any bot which is unable to analyze PrudentBot's source code to see how it will react.
It seems as if there’s an important parallel between the Iterated Prisoner’s Dilemma and the One-Shot Prisoner’s Dilemma With Access To Source Code: both versions of the PD provide a set of evidence which each side can use to attempt to predict the other’s behaviour. And since the PD-with-source is, according to the paper, equivalent to Newcomb’s Problem, this suggests that the Iterated-PD is equivalent to a variant of Newcomb’s based on reasonably-available historical evidence rather than Omega-level omniscience about the other player.
This also suggests that an important dividing line between algorithms one should defect against, and algorithms one should cooperate with, is somewhere around “complicated enough to be able to take my own actions into account when deciding its own actions”. For PD-with-source, that means being complicated enough to analyze source code; Iterated-PD’s structure puts that line at tit-for-tat.
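To make the iterated-PD side of that line concrete, here is a minimal sketch (my own illustrative code, not anything from the paper) of tit-for-tat: the only thing it does is condition on the opponent's previous move, which is exactly the "takes my actions into account" threshold described above.

```python
def tit_for_tat(my_history, opp_history):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def iterated_pd(strategy_a, strategy_b, rounds=10):
    """Play an iterated PD between two strategies and return both move histories."""
    history_a, history_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        history_a.append(move_a)
        history_b.append(move_b)
    return history_a, history_b
```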
This also suggests a certain intuitive leap to me, involving species with social interactions complicated enough to require thinking about others’ minds (parrots, dolphins, apes): that the runaway evolutionary process that led to our own species perhaps has to do with such mind-modeling finally becoming sophisticated enough to model one’s own mind for higher-level social plots… but that’s more likely than not just some college-freshman-level “say, what if...” musing. It could just as easily be that the big step forward was minds becoming complicated enough to be less-predictable black boxes, rather than simple, predictable “if I cheat on him and he catches me, he’ll peck me painfully” call-and-responses.
Actually, the Prisoner’s Dilemma between programs with access to each other’s source code (a.k.a. the program equilibrium setting) is a very different problem from both the one-shot and the iterated Prisoner’s Dilemma.
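For concreteness, here is one toy way to set up that scenario (my own illustrative framing, not the formalism used in the paper): each player submits a function that is handed its own source and the opponent's source and must return a move, and payoffs are read off the usual PD matrix.

```python
import inspect

# Standard PD payoffs, indexed by (row player's move, column player's move).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def program_pd(bot_a, bot_b):
    """Each bot receives its own source and the opponent's source, then moves."""
    src_a, src_b = inspect.getsource(bot_a), inspect.getsource(bot_b)
    move_a = bot_a(src_a, src_b)
    move_b = bot_b(src_b, src_a)
    return PAYOFFS[(move_a, move_b)]

def cooperate_bot(my_source, opp_source):
    return "C"

def defect_bot(my_source, opp_source):
    return "D"
```

Even in this toy framing, writing a nontrivial bot already means reasoning about what an arbitrary opponent program will do, which is why the interesting questions here look quite different from those in the ordinary one-shot and iterated games.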
The main attraction of both the one-shot and the iterated PD is that, despite their extreme simplicity, they provide a surprisingly good model of many common problems humans face in real-world social interactions.

On the other hand, Program PD is a much more artificial scenario, even in a speculative world of robots or of software agents. It is theoretically interesting to analyze that scenario, but even if an unambiguously satisfactory solution for it were found (and I think your paper doesn’t provide one, but I’ll leave that for another post * ), it would be far-fetched to claim that it essentially solves all practical instances of the PD.
( * ) Short story: PrudentBot cooperates too much. (PrudentBot, PrudentBot) is an unstable payoff-dominant Nash equilibrium. It’s not even a Nash equilibrium under a small modification of the game that penalizes program complexity. CliqueBots and generalized CliqueBots (bots that recognize each other with a criterion less strict than perfect textual equivalence but strict enough to guarantee functional equivalence) are better since they are stable payoff-dominant Nash equilibria and they never fail to exploit an exploitable opponent.
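In the toy program-PD framing sketched above, a textual-equivalence CliqueBot is a one-liner; the generalized versions would replace the string comparison with a looser check that still guarantees functional equivalence, which is the hard part.

```python
def clique_bot(my_source, opp_source):
    """Cooperate only with opponents whose source is textually identical to mine;
    defect against everything else, so no exploitable opponent goes unexploited."""
    return "C" if opp_source == my_source else "D"

# program_pd(clique_bot, clique_bot)    -> (3, 3): mutual cooperation
# program_pd(clique_bot, cooperate_bot) -> (5, 0): exploits the exploitable opponent
```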
http://intelligence.org/files/RobustCooperation.pdf