I think that regardless of how we define “Friendly”, an advanced enough Friendly AGI might sometimes take actions that will be perceived as hostile by some humans (or even all humans).
This makes it much harder to distinguish the actions of:
rogue AGI
Friendly AGI that failed to preserve its Friendliness
I think that regardless of how we define “Friendly”, an advanced enough Friendly AGI might sometimes take actions that will be perceived as hostile by some humans (or even all humans).
This makes it much harder to distinguish the actions of:
rogue AGI
Friendly AGI that failed to preserve its Friendliness
Friendly AGI that remains to be Friendly