I did not downvote (you were at −3 and −2 when I wrote this), but I think there are many possible reasons to downvote you depending on the voter’s implicit or explicit theory of voting.
This might be seen as a pure “puzzle” post, and you aren’t offering a puzzle solution that takes the question at face value. Your program doesn’t do anything quine-like (like put its own source code into an input variable), nor does it have any logic engine (even by implication).
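For contrast, here is what a minimally “quine-like” move looks like: a program that really does get its own source code into a variable. (This is a standard Python quine, offered only as an illustration; it is not part of the original puzzle or its proposed solution.)

```python
# A classic quine: `s` is a template that, when formatted with its own
# repr, reproduces the program's full source. Running the program
# prints exactly its own source code.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

The key trick is that `%r` inserts the *quoted* form of the string, so the template can describe itself without infinite regress, which is the same self-reference pattern the puzzle's "analyze your own source" setup relies on.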
This is the kind of solution evolution might work out in certain very specific environments where it happened to “just work”. But if you put this code somewhere it actually had to “work for a living”, analyzing other programs with roughly the same routines it uses to analyze itself, and gaining or losing by modulating its behavior toward them (some programs being suckers, some cheaters, and some open to implicit bargaining, so there were real opportunities to gain or lose), then this code would not give you “optimal play ignoring computation costs”: it is either a sucker or a cheater, depending on what “1” means, and in some contexts either of those strategies will lose.
Maybe another way of putting the criticism is that your “clever trick” is noticing that the problem can sort of be solved in practice by collapsing down to no extra levels of self-awareness. But for theoretical robustness in other circumstances it would be preferable to have a system that scales to infinity, so that agents capable of modeling other agents’ models of their own attempts to model other agents can still be “trusted by bad modelers”. (Remember that part of the unspoken motivation here is to build skills toward building a verifiably friendly AGI.)
Also, the people here are all super smart, so telling them “just be dumber and people will trust you” isn’t an answer they want to hear, and saying this (even indirectly) could maybe be seen as an attempt to troll, or to drag people into a topic area that is too difficult to discuss, given LW’s current sanity waterline (which is the only reason I personally vote things below zero).
Another possibility is that people don’t see the reversed Tortoise and Achilles joke and just think (1) you said something dumb and (2) they should “punish dumbness” in order to maintain LW’s purity (i.e. the signal-to-noise ratio).
The more reasons to downvote you expose, the better your chance of accumulating a really low score, and that comment exposed several.
Thanks for the analysis. I’ll try to clarify those points, then.
Of course, I’m not suggesting my answer as a solution to the prisoner’s dilemma. I actually don’t think this kind of mathematical analysis is even relevant to the prisoner’s dilemma in practice—I think the latter is best dealt with by pre-commitments, reputation tracking etc. to change the payoff matrix and in particular change the assumption that the interaction is one-off—which it typically isn’t anyway. (Note that in cases involving actual criminals, we need witness protection programs to make it even approximately one-off.)
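As a toy illustration of how repetition changes the picture (my own sketch, using the standard payoff values, not anything from the original puzzle): in a one-shot prisoner’s dilemma defection dominates, but once play is iterated and strategies can condition on history, a reputation-sensitive strategy like tit-for-tat earns more against itself than a pure defector earns against it.

```python
# Toy iterated prisoner's dilemma. C = cooperate, D = defect.
# PAYOFF maps (my move, their move) -> my payoff (standard values).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(strat_a, strat_b, rounds):
    """Run `rounds` rounds; each strategy sees the opponent's history."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

always_defect = lambda opp_hist: "D"
tit_for_tat = lambda opp_hist: opp_hist[-1] if opp_hist else "C"

# One-shot: defecting against a cooperator wins outright...
print(play(always_defect, tit_for_tat, 1))   # (5, 0)
# ...but over ten rounds mutual cooperation (30 each) beats what the
# defector can extract once its reputation catches up with it.
print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(always_defect, tit_for_tat, 10))  # (14, 9)
```

This is of course just the repeated-game point restated in code: the history-conditioning is a crude stand-in for the reputation tracking mentioned above.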
As for the logic engine, hey, I just unrolled the search loop then deleted 3^^^^3 − 1 lines of unreachable code. No different from what optimizing compilers do every day :-)
If we look at the logic puzzle aspect though, this is in a similar vein to “this statement is true” (the reverse of the classic “this statement is false” liar paradox; it has two consistent solutions instead of none—note that return 0 is just as valid a solution as return 1).
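That contrast can be made concrete in a few lines (a sketch of my own, in Python; the names `liar` and `henkin` are just labels): call a truth-value assignment “consistent” when what the sentence asserts about itself comes out equal to the value assigned.

```python
def consistent_assignments(claim):
    """claim(v) returns what the sentence asserts, given that the
    sentence itself has truth value v. A value v is consistent
    when the assertion evaluates back to v (a fixed point)."""
    return [v for v in (False, True) if claim(v) == v]

liar = lambda v: not v   # "this statement is false"
henkin = lambda v: v     # "this statement is true"

print(consistent_assignments(liar))    # [] -- no solution: paradox
print(consistent_assignments(henkin))  # [False, True] -- two solutions
```

The liar has no fixed point, hence paradox; the “this statement is true” form has two, which is exactly why both return 0 and return 1 count as answers.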
And if you ask what sort of proof system permits that, you start investigating higher-order logic, and self-reference, and the classic paradoxes thereof, and type theory and other possible solutions thereto, which are topics that are of very real significance to software verification and other branches of AI.
In summary, I think this puzzle goes places interesting enough that it’s actually worth taking apart and analyzing.
Actually, this statement is provable. This is the “Henkin sentence” and it is indeed provable in Peano arithmetic, so no need for stronger things like type theory etc. My proof in 1 is basically a translation of the proofs of the Henkin sentence and Löb’s theorem (which is used to prove the former).
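For reference, the two facts being invoked can be stated as follows (these are the standard formulations, not a reconstruction of the linked proof):

```latex
\[
\text{(L\"ob)}\quad
\text{If } PA \vdash \mathrm{Prov}(\ulcorner\varphi\urcorner)\rightarrow\varphi,
\text{ then } PA \vdash \varphi.
\]
\[
\text{(Henkin sentence, via the diagonal lemma)}\quad
PA \vdash H \leftrightarrow \mathrm{Prov}(\ulcorner H\urcorner).
\]
```

Taking $\varphi = H$, the forward direction of the Henkin equivalence is exactly the hypothesis of Löb’s theorem, so $PA \vdash H$: “this statement is provable” is itself provable.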
Yes, in that sense you’re right, this is isomorphic to the Henkin sentence, which is provable in first-order Peano arithmetic, if you set everything up just right in terms of precisely chosen axioms and their correspondence to the original code. (Note that there is no guarantee of such correspondence—return 0 is, after all, also a correct solution to the problem.)
But the metaproof in which you showed this, and the sketch you gave of how you would go about implementing it, are not written in first-order PA. And by the time you have actually written out all the axioms, implemented the proof checker etc., it will be apparent that almost all the real work is not being done on the computer, but in your brain, using methods far more powerful than first-order PA.
If we don’t regard this as a problem, then we should be willing to accept return 1 as a solution. But if we do regard this as a problem (and I think we should; apart from anything else, it means nearly all problems that would benefit from formal solution don’t actually receive one, because there just isn’t that much skilled human labor to go around—this problem is itself an example of that; you didn’t have time to do more than sketch the implementation) then we have to climb the ladder of more powerful formal techniques.
I’m curious, can the downvoters give any reasons for disagreeing with me?
And once again I will ask the downvoters, which part of my comment do you disagree with?
I upvoted the grandparent and parent, because what you said seems right to me. I wish people wouldn’t downvote someone asking why e was downvoted.
Upvoted entire ancestor tree, for similar reasons.