drethelin comments on The Need for Human Friendliness

drethelin 7 Mar 2013 8:15 UTC
0 points
Very little. I don’t like my odds. If Eliezer has provable friendliness theorems but not an AI, it’s in his and everyone’s interest to distribute the generalized theorem to everyone possible so that anyone working on recursive AGI has a chance to make it friendly, which means the algorithms will be checked by many, publicly. If Eliezer has the theorems and an AI ready to implement, there’s nothing I can do about it at all. So why worry?
- Decius 7 Mar 2013 8:35 UTC
  −4 points
  Parent
  You could Trust Eliezer and everyone who checked his theorems and then put money towards the supercomputer cluster which is testing every string of bits for ‘friendly AI’.
  
  In fact, at that point that project should become humanity’s only notable priority.
  - wedrifid 7 Mar 2013 14:05 UTC
    1 point
    Parent
    
    You could Trust Eliezer and everyone who checked his theorems and then put money towards the supercomputer cluster which is testing every string of bits for ‘friendly AI’.
    
    In fact, at that point that project should become humanity’s only notable priority.
    
    Probably not. That project does not sound efficient (to the extent that it sounds unfeasible).
    - Decius 7 Mar 2013 17:33 UTC
      −1 points
      Parent
      Does the expected return on “this has a one in 2^(10^10) chance of generating a proven friendly AI” lose to anything else in terms of expected result? If so, we should be doing that to the exclusion of FAI research already.
      - wedrifid 7 Mar 2013 18:04 UTC
        2 points
        Parent
        
        Does the expected return on “this has a one in 2^(10^10) chance of generating a proven friendly AI” lose to anything else in terms of expected result?
        
        Yes. It loses to any sane strategy of finding an FAI that is not “testing every string of bits”.
        
        Testing every string of bits is a nice ‘proof of concept’ translation of “can prove a string of bits is an FAI if it in fact is an FAI” to “actually have an FAI”. It just isn’t one that works in practice unless, say, you have a halting oracle in your pocket.
        Decius 7 Mar 2013 23:27 UTC
        0 points
        Parent
        If I had a halting oracle, I would have AGI.
        
        By your logic, testing an actual AI against the proofs of friendliness isn’t worth the expense, because there might be an inconclusive result.
        wedrifid 8 Mar 2013 2:23 UTC
        0 points
        Parent
        
        If I had a halting oracle, I would have AGI.
        
        No you wouldn’t.
        
        By your logic, testing an actual AI against the proofs of friendliness isn’t worth the expense, because there might be an inconclusive result.
        
        No, my logic doesn’t claim that either.
        Decius 8 Mar 2013 4:55 UTC
        0 points
        Parent
        I write a program which recognizes AGI and halts when it finds one. It tests all programs less than 1 GB. If it halts, I narrow the search space. If it doesn’t, I expand the search space. Each search takes the time used by the halting oracle.
        
        We already have a way to prove that a string of bits is FAI; it is trivial to generalize that to AI.
        
        “I have a string of bits. I do not know if it is FAI or not, but I have a way of testing that is perfectly specific and very sensitive. The odds of an AI researcher writing a FAI are vanishingly small. Should I test it if it was written by an AI researcher? Should I test it if it isn’t?
        wedrifid 8 Mar 2013 8:59 UTC
        0 points
        Parent
        
        We already have a way to prove that a string of bits is FAI; it is trivial to generalize that to AI.
        
        Yes, if you have that proof and a halting oracle then you have AGI. That is vastly different to the claim “If I had a halting oracle I would have an AGI”.
        Decius 8 Mar 2013 19:47 UTC
        0 points
        Parent
        That proof was a prerequisite for the idea to try the brute force method of finding the FAI.
        
        Someone could also set an end condition that only an AGI is likely to ever reach and use iterated UTMs to find a program that reached the desired end condition. I’m not sure how to a priori figure out what someone smarter than me would do and test for it, but it isn’t the first thing I would do with a constant-time halting oracle. The first thing I would do is test a program that found additional implications of first order logic and halted when it found a contradiction. Then I would add the inverse of the Gödel statement and test that program.