Wei Dai comments on Q&A with new Executive Director of Singularity Institute

Wei Dai Nov 15, 2011, 10:08 PM
39 points
What I’m afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or the formalization of the notion of “safety” used by the proof is wrong. This kind of thing happens a lot in cryptography, if you replace “safety” with “security”. These mistakes are still occurring today, even after decades of research into how to do such proofs and what the relevant formalizations are. From where I’m sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there is no decades of time to refine the proof techniques and formalizations. There’s good recent review of the history of provable security, titled Provable Security in the Real World, which might help you understand where I’m coming from.
What links here?
- cousin_it Nov 16, 2011, 2:23 PM
  12 points
  Parent
  Your comment has finally convinced me to study some practical crypto because it seems to have fruitful analogies to FAI. It’s especially awesome that one of the references in the linked article is “An Attack Against SSH2 Protocol” by W. Dai.
  - gwern Nov 17, 2011, 1:24 AM
    8 points
    Parent
    More than fruitful analogies, I’d say: http://lesswrong.com/lw/3cz/cryptographic_boxes_for_unfriendly_ai/
- John_Maxwell Mar 23, 2012, 6:51 AM
  4 points
  Parent
  
  From where I’m sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there is no decades of time to refine the proof techniques and formalizations.
  
  Correct me if I’m wrong, but it doesn’t seem as though “proofs” of algorithm correctness fail as frequently as “proofs” of cryptosystem unbreakableness.
  
  Where does your intuition that friendliness proofs are on the order of reliability of cryptosystem proofs come from?
  - Wei Dai Mar 23, 2012, 7:07 AM
    13 points
    Parent
    Interesting question. I guess proofs of algorithm correctness fail less often because:
    
    It’s easier to empirically test algorithms to weed out the incorrect ones, so there are fewer efforts to prove conjectures of correctness that are actually false.
    It’s easier to formalize what it means for an algorithm to be correct than for a cryptosystem to be secure.
    
    In both respects, proving Friendliness seems even worse than proving security.
- CarlShulman Nov 15, 2011, 10:25 PM
  0 points
  Parent
  
  What I’m afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or that the formalization of the notion of “safety” used by the proof is wrong.
  
  Thanks for clarifying.
  
  This kind of thing happens a lot in cryptography,
  
  I agree.