This strikes me as a purely semantic question regarding what goals are consistent with an agent qualifying as “friendly”.
I think that regardless of how we define “Friendly”, an advanced enough Friendly AGI might sometimes take actions that will be perceived as hostile by some humans (or even all humans).
This makes it much harder to distinguish between the actions of:
rogue AGI
Friendly AGI that failed to preserve its Friendliness
Friendly AGI that remains Friendly