I am aware of the need for those things (that is part of what I mean by the need for friendliness in OAI), but as far as I can tell, Paternalistic FAI requires you to solve those problems, plus the simple one of ‘not being very powerful but insane’, plus a basic understanding of what matters to humans, plus incredibly meta matters of human values. An OAI can leave off the last of those problems.
I meant that by going meta we might not have to solve them fully.
All the problems you list sound nearly identical to me. In particular, “what matters to humans” sounds more vague but just as meta. If it includes enough details to actually reassure me, you could just tell the AI, “Do that.” Presumably what matters to us would include ‘the ability to affect our environment, e.g. by giving orders.’ What do you mean by “very powerful but insane”? I want to parse that as ‘intelligent in the sense of having accurate models that allow it to shape the future, but not programmed to do what matters to humans.’
“Very powerful but insane”: the AI’s responses to orders seem to make less than no sense, yet the AI is still able to do damage.
“What matters to humans”: Things like the Outcome Pump example, where any child would know that not dying is supposed to be part of “out of the building”, but not including the problems that we are bad at solving, such as fun theory and the like.