The probability I assign to our being able to build a friendly AI before we are able to build a hostile one is very low. You have thought more about the problem than I have, but I'm not really convinced. I suppose we could both turn out to be right at once, and then we are in trouble.
I will say that I think you underestimate how powerful it is to let a superintelligence write a proof for you. The question is not really whether you have proof techniques to verify friendliness. It is whether you have a formal language expressive enough to describe friendliness, in which a transhuman could then find a proof. Maybe that is just as hard as the original problem, because even formally articulating friendliness is incredibly difficult.
Usually, verifying a proof is considerably easier than finding one, so it does not seem at all unreasonable to use a machine to find a proof, if a proof is what you are looking for.
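The verify/find asymmetry can be made concrete with a toy analogy (mine, not from the discussion above): for Boolean satisfiability, checking a proposed solution takes time linear in the formula's size, while finding one by brute force can require examining all 2^n assignments. This is a sketch of that asymmetry, not anything specific to friendliness proofs:

```python
from itertools import product

# A formula in CNF: a list of clauses; each clause is a list of signed
# variable indices (positive = variable true, negative = variable false).
# Example: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
FORMULA = [[1, 2], [-1, 3], [-2, -3]]

def verify(formula, assignment):
    """Check a candidate solution: one pass over the formula (cheap)."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in formula
    )

def find(formula, n_vars):
    """Find a solution by brute force: up to 2**n_vars candidates (expensive)."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if verify(formula, assignment):
            return assignment
    return None

solution = find(FORMULA, 3)
assert solution is not None and verify(FORMULA, solution)
```

The relevance to the argument: the proposal only requires us to play the role of `verify`, while the superintelligence plays the role of `find`; the hard part the paragraph flags is writing down `FORMULA` itself, i.e., formally stating friendliness at all.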