orthonormal comments on Superintelligent AGI in a box—a question.

orthonormal 25 Feb 2012 15:20 UTC
3 points
A couple of things:
- To be precise, you’re offering an approach to safe Oracle AI rather than Friendly AI.
- In a nutshell, what I like about the idea is that you’re explicitly handicapping your AI with a utility function that only cares about its immediate successor rather than its eventual descendants. It’s rather like the example I posed where a UDT agent with an analogously myopic utility function allowed itself to be exploited by a pretty dumb program. Controlling a handicapped agent like that seems a lot more feasible than trying to control an agent that can think strategically about its future iterations.
- To expand on my questions, note that in human beings, the sort of creativity that helps us write more efficient algorithms for a given problem is strongly correlated with the sort of creativity that lets people figure out why they’re being asked the specific questions they are. If a bit of meta-gaming comes in handy at any stage, that is, if modeling the world that originated these questions wins (over the alternatives it enumerated at that stage) on criterion 3 even once, then we might be in trouble.