The most obvious implementation for the easy version will return 0 (or, if the length limit is lifted, fail to return at all).
It’s important, however, to be clear on just what constitutes a valid proof. Consider the following implementation:
main() {return 1;}
Not valid because it didn’t find a proof, you say? On the contrary, assuming we are dealing with idealized mathematical computers that are perfectly deterministic and have no outside inputs, the fact that a program returns a given value on this run does indeed constitute a proof that it will do so on every run.
The point being that, while you can use concepts like proof to draw rigorous conclusions, in the face of counterintuitive things like self reference, you need some experience actually nailing things down before you can be sure you’ve closed all the loopholes.
I don’t understand this. Do you mean the proof checker fails to check anything? Wouldn’t that be a wrong implementation, rather than an obvious one?
No, I mean the obvious implementation would have the program basically asking “what does this program return? Well, it tries to find out what this program returns. What does this program return? Well, it tries to find out...” until it hits the length limit and comes back with no solution found (or, if the length limit is lifted, keeps going forever). This is true even if the proof checker always returns quickly.
This is absolutely not the obvious implementation of the algorithm in the post. My program does not “try to find out” something. It mechanically checks all proofs up to some maximum length. You can’t just replace a piece of code with another piece that has a vaguely similar intent, but fails to halt!
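To make the disagreement concrete, here is a minimal sketch (in Python, purely illustrative, not the author’s code) of what “mechanically checks all proofs up to some maximum length” could look like. The proof encoding and the checker `check_proof` are hypothetical stand-ins for a real proof system:

```python
from itertools import product

ALPHABET = "01"   # illustrative: candidate proofs encoded as binary strings
MAX_LEN = 12      # stands in for the post's proof-length limit

def check_proof(proof: str, claim: str) -> bool:
    # Hypothetical verifier: should return True iff `proof` is a valid
    # proof of `claim` in the chosen axiom system. This dummy accepts
    # nothing, so the search below falls through to "no proof found".
    return False

def search(claim: str) -> int:
    # Enumerate every candidate proof up to MAX_LEN and check each one.
    for length in range(1, MAX_LEN + 1):
        for symbols in product(ALPHABET, repeat=length):
            if check_proof("".join(symbols), claim):
                return 1
    return 0  # length limit exhausted with no proof found
```

Note that the checker itself always returns quickly; whether the outer loop is usefully described as “trying to find out what this program returns” is exactly the point under dispute.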
The ‘fails to halt’ part would apply in either case only if the length limit were removed. But you are right that mechanically checking all proofs up to a maximum length is a mathematically valid implementation and can only be replaced with another piece of code that is guaranteed to give an identical result.
However, that implementation can still be reasonably described as trying to find out something. The intent can be seen not in the simple loop that iterates through all proofs up to a certain length, but in the choice of logic and axioms to encode the problem. The proofs generated from suitably chosen axioms are isomorphic to execution paths of an appropriately written program (as whpearson observed, a way to see this is to consider the Curry-Howard correspondence).
This isn’t the same as the calculator that asks itself what value it will return, because there’s a subtle separation of levels here.
The Curry-Howard correspondence is what you are referring to right?
That program does not prove its return value under the Curry-Howard Correspondence without a lot of footwork (see first caveat). Under the CH interpretation, programs are proofs that their types are inhabited, not that they can produce their return values. Thus, the canonical thing one could say the above program proves is “int”.
Some caveats. First, one could imagine a sufficiently complex type system where there is a type of all integers which are 1. Then, one could say that the above program is a proof of the type “type-of-ints-that-are-1”. However, when adding this much specificity to one’s type system, one has to be careful that the resulting system can decidably type terms; adding richness often leads to undecidability.
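Python’s `typing.Literal` (PEP 586) is a concrete, if modest, instance of such a type: `Literal[1]` is inhabited only by the value 1, and a static checker such as mypy will reject an annotated function that returns anything else. A minimal illustration:

```python
from typing import Literal

def one() -> Literal[1]:
    # Under a static checker such as mypy, replacing this with
    # `return 2` is a type error: 2 does not inhabit Literal[1].
    return 1
```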
Second, there is a simple constructive proof that the above program produces 1. Just run it and see; if it returns 1 in finite time, that’s a proof.
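That “run it and see” proof is trivially mechanizable; under the earlier assumption of a deterministic, input-free program, a single terminating run settles the return value for every run:

```python
def program() -> int:
    # Stand-in for main() {return 1;}
    return 1

# One finite run of a deterministic program with no inputs is itself
# a (constructive) proof of its return value on all runs.
assert program() == 1
```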
That is certainly one way to look at it.
Another way to look at it is to consider purely computational proofs, such as the four color theorem, or the solutions to checkers and 5 x 5 Go.
I’m curious, can the downvoters give any reasons for disagreeing with me?
I did not downvote (you were at −3 and −2 when I wrote this), but I think there are many possible reasons to downvote you, depending on the voter’s implicit or explicit theory of voting.
This might be seen as a pure “puzzle” post, and you aren’t offering a puzzle solution that takes the question at face value. Your program doesn’t do anything quine-like (like put its own source code into an input variable), nor does it have any logic engine (even by implication).
This is the kind of solution evolution might work out in certain specific environments where it happened to “just work”. But suppose you put this code somewhere it actually had to “work for a living”: analyzing other programs using roughly the same routines it uses to analyze itself, and gaining or losing by modulating its behavior toward them (with some programs being suckers, some cheaters, and some open to implicit bargaining, so that there were real opportunities to gain or lose). In that setting, this code would not give you “optimal play ignoring computation costs”: it is either a sucker or a cheater, depending on what “1” means, and in some contexts either of those strategies will lose.
Maybe another way of putting the criticism is that your “clever trick” is in noticing that the problem can sort of be solved in practice by collapsing down to have no extra levels of self-awareness. But for theoretical robustness in other circumstances it would be preferable to have a system that scales to infinity, so that agents capable of modeling other agents’ models of their own attempts to model other agents can still be “trusted by bad modelers”. (Remember that part of the unspoken motivation here is to build skills towards building a verifiably friendly AGI.)
Also, the people here are all super smart, so telling them “just be dumber and people will trust you” isn’t an answer they want to hear, and saying this (even indirectly) could maybe be seen as an attempt to troll, or to drag people into a topic area that is too difficult to discuss, given LW’s current sanity waterline (which is the only reason I personally vote things below zero).
Another possibility is that people don’t see the reversed Tortoise and Achilles joke and just think (1) you said something dumb and (2) they should “punish dumbness” in order to maintain LW’s purity (i.e. the signal-to-noise ratio).
The more reasons to downvote you expose, the more chance you have to accumulate a really low score, and that comment exposed several.
Thanks for the analysis. I’ll try to clarify those points, then.
Of course, I’m not suggesting my answer as a solution to the prisoner’s dilemma. I actually don’t think this kind of mathematical analysis is even relevant to the prisoner’s dilemma in practice—I think the latter is best dealt with by pre-commitments, reputation tracking etc. to change the payoff matrix and in particular change the assumption that the interaction is one-off—which it typically isn’t anyway. (Note that in cases involving actual criminals, we need witness protection programs to make it even approximately one-off.)
As for the logic engine, hey, I just unrolled the search loop then deleted 3^^^^3 − 1 lines of unreachable code. No different than optimizing compilers do every day :-)
If we look at the logic puzzle aspect though, this is in a similar vein to “this statement is true” (the reverse of the classic “this statement is false” liar paradox: it has two consistent solutions instead of none; note that return 0 is just as valid a solution as return 1).
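The two-solutions-versus-none contrast can be checked by brute enumeration of truth assignments: “this statement is true” is the fixed-point condition s = s, while the liar is s = not s:

```python
# "This statement is true": the sentence asserts itself, so a truth
# value s is consistent exactly when s == s.
truth_teller = [s for s in (False, True) if s == s]

# The liar, "this statement is false": s is consistent when s == (not s).
liar = [s for s in (False, True) if s == (not s)]

print(truth_teller)  # both assignments survive: two consistent solutions
print(liar)          # no assignment survives: the paradox
```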
And if you ask what sort of proof system permits that, you start investigating higher-order logic, and self-reference, and the classic paradoxes thereof, and type theory and other possible solutions thereto, which are topics that are of very real significance to software verification and other branches of AI.
In summary, I think this puzzle goes places interesting enough that it’s actually worth taking apart and analyzing.
Actually, this statement is provable. This is the “Henkin sentence” and it is indeed provable in Peano arithmetic, so no need for stronger things like type theory etc. My proof in 1 is basically a translation of the proofs of the Henkin sentence and Löb’s theorem (which is used to prove the former).
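For reference, in standard provability-logic notation (with the box read as “provable in PA”), the statements involved are:

```latex
% Henkin sentence: a fixed point of the provability predicate
H \leftrightarrow \Box H

% Loeb's theorem: if PA proves (\Box P \rightarrow P), then PA proves P
\Box(\Box P \rightarrow P) \rightarrow \Box P

% Taking P := H, the fixed point gives that PA proves (\Box H \rightarrow H),
% so Loeb's theorem yields that PA proves H.
```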
Yes, in that sense you’re right, this is isomorphic to the Henkin sentence, which is provable in first-order Peano arithmetic, if you set everything up just right in terms of precisely chosen axioms and their correspondence to the original code. (Note that there is no guarantee of such correspondence—return 0 is, after all, also a correct solution to the problem.)
But the metaproof in which you showed this, and the sketch you gave of how you would go about implementing it, are not written in first-order PA. And by the time you have actually written out all the axioms, implemented the proof checker etc., it will be apparent that almost all the real work is not being done on the computer, but in your brain, using methods far more powerful than first-order PA.
If we don’t regard this as a problem, then we should be willing to accept return 1 as a solution. But if we do regard this as a problem (and I think we should; apart from anything else, it means nearly all problems that would benefit from formal solution, don’t actually receive such, because there just isn’t that much skilled human labor to go around—this problem is itself an example of that; you didn’t have time to more than sketch the implementation) then we have to climb the ladder of more powerful formal techniques.
And once again I will ask the downvoters, which part of my comment do you disagree with?
I upvoted the grandparent and parent, because what you said seems right to me. I wish people wouldn’t downvote someone asking why e was downvoted.
Upvoted entire ancestor tree, for similar reasons.