Friendly means something like “will optimize for the appropriate complex human-like values correctly.”
Saying “we don’t have clear criteria for appropriate human values” is just another way of saying that defining Friendly is hard.
Provably Friendly means we have a mathematical proof that an AI will be Friendly before we start running the AI.
An AI that gives its designer ultimate power over humanity is almost certainly not Friendly, even if it were Provably designer-godlike-powers-implementing.
How do you define “appropriate”? It seems a little circular. Friendly AI is AI that optimises for appropriate values, and appropriate values are the ones for which we’d want a Friendly AI to optimise.
You might say that “appropriate” values are ones which “we” would like to see the future optimised towards, but I think whether such values even exist humanity-wide is an open question (I’m leaning towards “no”), so you should probably have a contingency definition for what to do if they turn out not to exist.
I would also be shocked if there were a “provable” definition of “appropriate” (as opposed to the friendliness of the program being provable with respect to some definition of “appropriate”).
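To make that last distinction concrete, here is a minimal Lean sketch (the names FriendlyWrt, appropriate, and ai are all hypothetical, not drawn from any actual FAI formalism): what you could hope to prove is that a program is friendly relative to a supplied predicate “appropriate”; the predicate itself is an assumption you feed in, not a theorem you get out.

```lean
-- Toy model: outcomes the AI can bring about, the AI as a function from
-- inputs to outcomes, and "appropriate" as an assumed predicate on outcomes.
def FriendlyWrt {Input Outcome : Type}
    (appropriate : Outcome → Prop)   -- the spec we must supply; nothing proves it is the right one
    (ai : Input → Outcome) : Prop :=
  ∀ x : Input, appropriate (ai x)

-- A "Provably Friendly" claim is then a proof of `FriendlyWrt appropriate ai`
-- for some concrete `appropriate`. With a trivial spec the proof is easy;
-- nothing in it certifies that `appropriate` captures humanity-wide values.
example {Input Outcome : Type} (ai : Input → Outcome) :
    FriendlyWrt (fun _ => True) ai := by
  intro _; trivial
```

The point of the sketch is just that the proof obligation is always relative to whichever `appropriate` you plug in, which is exactly why a “provable” definition of “appropriate” itself would be surprising.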