Nothing a human can do is done in the most extreme possible manner. An AI, by contrast, could be made to wirehead more easily or less easily. It could think faster or slower. It could be more creative or less creative. It could be nicer or meaner.
I wouldn’t begin to know how to build an AI that’s improved in all the right ways. It might not even be humanly possible. And if it’s not humanly possible to build a good AI, it’s probably also impossible for the AI to improve itself into one. Even so, there’s still a good chance that it would work.
Probably true—and few want wireheading machines—but the issues are the scale of the technical challenges, and—if these are non-trivial—how much folk will be prepared to pay for the feature. In a society of machines, maybe the occasional one that turns Buddhist—and needs to go back to the factory for psychological repairs—is within tolerable limits.
Many apparently think that making machines value “external reality” fixes the wirehead problem (e.g. see “Model-based Utility Functions”), but that leads directly to the problems of what you mean by “external reality” and how to tell a machine that this is what it is supposed to be valuing. It doesn’t look much like a solution to the problem to me.
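To make the distinction concrete, here is a minimal sketch in Python of the two kinds of utility function being contrasted. All the names (`WorldModel`, `utility_over_observations`, `utility_over_model`, the `paperclips_in_world` quantity) are illustrative assumptions, not anything taken from the cited paper; the point is only that the model-based version relocates the question of what “external reality” means into the specification of the model.

```python
from dataclasses import dataclass


@dataclass
class WorldModel:
    """The agent's internal estimate of the state of the external world."""
    estimated_state: dict

    def update(self, observation: dict) -> None:
        # A real agent would do Bayesian filtering or similar here;
        # this sketch just copies the observation into the estimate.
        self.estimated_state.update(observation)


def utility_over_observations(observation: dict) -> float:
    # Wirehead-prone: the agent can maximise this by tampering with its
    # own sensors so that observation["reward_signal"] always reads high.
    return observation.get("reward_signal", 0.0)


def utility_over_model(model: WorldModel) -> float:
    # "Model-based" utility: score the inferred external state rather than
    # the raw sensory channel. The problem has only moved: someone still has
    # to define what "paperclips_in_world" means and keep the model honest.
    return model.estimated_state.get("paperclips_in_world", 0.0)


if __name__ == "__main__":
    model = WorldModel(estimated_state={})
    obs = {"reward_signal": 10.0, "paperclips_in_world": 3.0}
    model.update(obs)
    print(utility_over_observations(obs))  # high whenever the sensor says so
    print(utility_over_model(model))       # only as meaningful as the model
```

The sketch makes the second objection visible: `utility_over_model` is immune to sensor tampering only to the extent that the world model itself is specified correctly and cannot be corrupted, which is exactly the part that remains unsolved.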