I noticed this but didn’t explicitly point it out. My point was that when paulfchristiano said:
If the AI has a simple goal—press the button—then I think it is materially easier for the AI to modify itself while preserving the button-pressing goal [...] the problem is difficult, but I don’t think it is in the same league as friendliness
he was also assuming that he could handle your objections, e.g. that his AI wouldn't find a loophole in the definition of "pressing a button". So the problem he described was not, in fact, simpler than the general problem of FAI.
I don’t think you’ve noticed that this is just moving the fundamental problem to a different place. For example, you haven’t specified things like:
Don’t lie to AI 1 about your actions
Don’t persuade AI 1 to modify itself
Don’t find loopholes in the definition of “AI 1” or “modify”
etc., etc. If you could enforce all of these constraints on a superintelligence as it self-modifies, you'd already have solved the general FAI problem.
IOW, what you propose isn’t actually a reduction of anything, AFAICT.