Rather than unfriendly AI, I think he means a Friendly AI that’s only Friendly to one person (or very few people). If we’re going to be talking about this concept then we need a better term for it. My inner nerd prefers Suzumiya AI.
Genie AI?
Conceptually there is very little difference between AGI which understands the values of one human and AGI which understands a hypothetical aggregate of human values. Therefore I use FAI to refer to both concepts.
The big difference is in how the AI builds that aggregate from individual values. There are surely many ways to do so, and people will disagree about them (some ways will not be wanted by anyone at all, but we could still get them by mistake and end up with a non-Friendly AI). On the other hand, existing suggestions like CEV are greatly underspecified and haven’t even been strictly proven to have real (or unique) solutions.
That apparently conflicts with how it is defined in Creating Friendly AI.
I disagree. Why couldn’t the outlined procedure for creating Friendly AI (3.4.4 of link) be based on one individual, or on a superposition of a small group of individuals, instead of a superposition of all of humanity?
The preferences of particular individual humans may involve harming other humans.
Defining such preferences as “friendly” violates the concept as it was originally intended, IMO.