Well, I don’t take seriously any of these speculations about God-like vs. merely angel-like creations. They’re just a distraction from the task of actually building them, which no-one knows how to do anyway.
But still, if a wFAI was capable of eliminating those things, why be picky and try for sFAI?
Because we have no idea how hard it is to specify either. If, along the way, it turns out to be easy to specify wFAI and risky to specify sFAI, then the reasonable course is clear. Doubly so since a wFAI would almost certainly be useful in helping specify an sFAI.
Seeing as human values are a minuscule target, it seems probable that specifying wFAI is harder than sFAI, though.
“Specify”? What do you mean?
Specifications, à la programming.
Why would it be harder? One could tell the wFAI to improve factors that are strongly correlated with human values, such as food stability, resources that cure preventable diseases (such as diarrhea, which, as we know, kills far more people than it should), and security from natural disasters.
Because if you screw up specifying human values, you don’t get wFAI; you just die (hopefully).
It’s not optimizing human values; it’s optimizing circumstances that are strongly correlated with human values. It would be a logistics kind of thing.
Have you ever played Corrupt a Wish?
No, but I’m guessing I’m about to.
“I wish for a list of possible sequences of actions, any of whose execution would satisfy the following condition:
Within twenty years, Nigeria has standards of living such that it would receive the same rating as Finland on [Placeholder UN Scale of People’s-Lives-Not-Being-Awful].”
The courses of action would be evaluated by a think tank until they decided that one of them was acceptable, and then the wFAI would be given the go-ahead.
The AI optimizes only for that and doesn’t generate a list of non-obvious side effects. You implement one of them and something horrible happens to Finland and/or to countries besides Nigeria (see the toy sketch after this list).
or
In order to generate said list, I simulate Nigeria millions of times at a resolution such that entities within the simulation pass the Turing test. Most of the simulations involve horrible outcomes for all involved.
or
I generate such a list, including many sequences of actions that lead to a small group being able to take over Nigeria and/or Finland and/or the world (or that create some other power differential that screws up international relations).
or
In order to execute such an action I need more computing power, and you forgot to specify which actions are acceptable for obtaining it.
or
The wFAI is much cleverer than a single human thinking about this for 2 minutes and can screw things up in ways that are as opaque to you as human actions are to a dog.
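A minimal toy sketch of the first failure mode above, assuming a made-up list of candidate plans with hand-assigned scores and side effects (nothing here comes from the original exchange); the point is only that anything the stated objective doesn’t mention never enters the ranking:

```python
# Toy sketch: an optimizer that ranks plans purely by the stated metric.
# All plan names, scores, and side effects below are invented for illustration.

candidate_plans = [
    # (description, stated_metric_score, unmodeled_side_effect)
    ("gradual infrastructure and health investment", 0.70, None),
    ("divert Finland's budget and industry to Nigeria", 0.95, "Finland's living standards collapse"),
    ("install a compliant strongman to force reforms through", 0.90, "regional destabilization"),
]

def stated_objective(plan):
    """Score a plan by the wished-for metric only; side effects are invisible to it."""
    _description, metric_score, _side_effect = plan
    return metric_score

# The optimizer dutifully returns the highest-scoring plan...
best = max(candidate_plans, key=stated_objective)
print("Recommended plan:", best[0])

# ...and the catastrophic consequence never figured in the evaluation,
# because the objective never asked about it.
print("Unmodeled side effect:", best[2])
```

The think-tank review in the scenario above is supposed to catch this, but it only works if the humans can spot consequences the AI had no incentive to surface.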
In general, specifying an oracle/tool AI is not safe: http://lesswrong.com/lw/cze/reply_to_holden_on_tool_ai/
Even more generally, our ability to build an AI that is friendly will have nothing to do with our ability to generate clauses in English that sound reasonable.