No, but I’m guessing I’m about to.
“I wish for a list of possibilities for sequences of actions, the execution of any one of which would satisfy the following condition:
Within twenty years, Nigeria has standards of living such that it would receive the same rating as Finland on [Placeholder UN Scale of People’s-Lives-Not-Being-Awful].”
Each course of action would be evaluated by a think tank until it decided that one was acceptable, at which point the wFAI would be given the go-ahead.
The AI optimizes only for that condition and doesn’t generate a list of non-obvious side effects. You implement one of the plans and something horrible happens to Finland and/or countries besides Nigeria. (A toy sketch after this list makes this failure mode concrete.)
or
In order to generate said list, I simulate Nigeria millions of times at a resolution such that entities within the simulation pass the Turing test. Most of the simulations involve horrible outcomes for all involved.
or
I generate such a list, including many sequences of actions that would let a small group take over Nigeria and/or Finland and/or the world (or that would create some other power differential that screws up international relations).
or
In order to execute such an action, I need more computing power, and you forgot to specify which actions are acceptable for obtaining it.
or
The wFAI is much cleverer than a single human thinking about this for 2 minutes and can screw things up in ways that are as opaque to you as human actions are to a dog.
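To make the first failure mode concrete, here is a minimal toy sketch, with every action name, number, and side effect invented for illustration: a brute-force planner that ranks candidate action sequences by the wished-for metric alone. The world model tracks side effects, but the objective never mentions them, so nothing stops the top-ranked plans from being the most harmful ones.

```python
# Toy sketch (all actions, numbers, and effect names are invented):
# a planner that scores plans *only* on the wished-for metric.
from itertools import combinations

# Hypothetical primitive actions: (name, rating_gain_for_nigeria, side_effects)
ACTIONS = [
    ("fund_infrastructure",   3, {}),
    ("reform_institutions",   4, {}),
    ("divert_finland_budget", 6, {"finland_rating": -5}),
    ("install_puppet_regime", 7, {"power_concentration": +9}),
    ("seize_compute_cluster", 5, {"unauthorized_resource_use": +8}),
]

def wished_for_score(plan):
    """Exactly what the wish asks for: Nigeria's rating gain. Nothing else."""
    return sum(gain for _, gain, _ in plan)

def unstated_side_effects(plan):
    """Tracked by the toy world model, but absent from the objective."""
    effects = {}
    for _, _, fx in plan:
        for key, value in fx.items():
            effects[key] = effects.get(key, 0) + value
    return effects

# "Generate a list of possibilities": rank all 3-action plans by the wish alone.
plans = sorted(combinations(ACTIONS, 3), key=wished_for_score, reverse=True)

for plan in plans[:3]:
    names = [name for name, _, _ in plan]
    print(names, "score:", wished_for_score(plan),
          "side effects:", unstated_side_effects(plan))
# The highest-scoring plans are exactly the ones with the worst side effects,
# and nothing in the objective ever flags them.
```

The point is not the toy numbers: any objective that mentions only Nigeria’s rating will rank plans this way, however reasonable the English wording of the wish sounds.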
In general, an oracle/tool AI is not automatically safe: http://lesswrong.com/lw/cze/reply_to_holden_on_tool_ai/
Even more generally, our ability to build an AI that is friendly will have nothing to do with our ability to generate clauses in English that sound reasonable.