When we jump from direct normativity to indirect normativity, it is reasonably claimed that we gain a lot.
I sometimes wonder whether the issue of indirect normativity has been pushed far enough. The limiting case is that there is some way to specify, in “machine comprehensible” terms, that a software intelligence should “do what I want”.
“outsourcing the hard intellectual work to the AI”
Just how much can be outsourced?
Could you program a software intelligence to go read books like Superintelligence, understand the concept of “Friendliness” or “Motivational alignment”, and then be friendly/motivationally aligned with yourself?
And couldn’t the problem of selecting a method to compromise between the billions of different axiologies of the humans on this planet also be outsourced to the AI, by telling it to motivationally align with the team of designers and their backers, subject to whatever compromises would have been made had the team tried to directly specify values? This is not to say that I am advocating a post-singleton world run purely for the benefit of the design team/project, but rather that if such a team (or individual) were already committed to trying to design something like “the CEV of humanity”, then an AI which was motivationally aligned with them would continue that task more quickly and safely.
Anyway, I think there is a fruitful discussion to be had about how the maximum amount of work can be offloaded to the AI; perhaps work on friendly AI should be thought of as that part of the motivational alignment problem that simply has to be done by the human(s).