cousin_it comments on BOOK DRAFT: ‘Ethics and Superintelligence’ (part 1)

cousin_it 14 Feb 2011 13:56 UTC
13 points
wedrifid is right: if you’re now counting on failsafes to stop CEV from doing the wrong thing, that means you could apply the same procedures to any other proposed AI, so the real value of your life’s work is in the failsafe, not in CEV. What happened to all your clever arguments saying you can’t put external chains on an AI? I just don’t understand this at all.
- Vladimir_Nesov 14 Feb 2011 14:53 UTC
  8 points
  Parent
  Any given FAI design can turn out to be unable to do the right thing, which corresponds to tripping failsafes, but to be a FAI it must also be potentially capable (for all we know) of doing the right thing. Adequate failsafe should just turn off an ordinary AGI immediately, so it won’t work as an AI-in-chains FAI solution. You can’t make AI do the right thing just by adding failsafes, you also need to have a chance of winning.
  - Eliezer Yudkowsky 14 Feb 2011 16:29 UTC
    0 points
    Parent
    Affirmed.
- wedrifid 15 Feb 2011 8:53 UTC
  5 points
  Parent
  
  wedrifid is right: if you’re now counting on failsafes to stop CEV from doing the wrong thing, that means you could apply the same procedures to any other proposed AI, so the real value of your life’s work is in the failsafe, not in CEV.
  
  Since my name was mentioned I had better confirm that I generally agree with your point but would have left out this sentence:
  
  What happened to all your clever arguments saying you can’t put external chains on an AI?
  
  I don’t disagree with the principle of having a failsafe—and don’t think it is incompatible with the aforementioned clever arguments. But I do agree that “but there is a failsafe” is an utterly abysmal argument in favour of preferring CEV over an alternative AI goal system.
  
  I just don’t understand this at all.
  
  Tell me about it. With most people if they kept asking the same question when the answer is staring them in the face and then act oblivious as it is told to them repeatedly I dismiss them as either disingenuous or (possibly selectively) stupid in short order. But, to borrow wisdom from HP:MoR:
  
  …. that just doesn’t sound like /Eliezer’s/ style.
  
  …but you can only think that thought so many times, before you start to wonder about the trustworthiness of that whole ‘style’ concept.