Uh, I am on the side of “a random AGI implementation is most likely not going to become friendly just because you sincerely try to befriend it”.
I am just saying that in the space of all possible AGIs there are a few that can be befriended, so your argument is not completely impossible, just statistically implausible (i.e. the prior is small but nonzero).
For example, you could accidentally create an AGI which has human-like instincts. It is possible; it just most likely will not happen. See also: Detached Lever Fallacy. An AGI that is friendly to you if you are friendly to it is less likely than an AGI that is friendly to you unconditionally, because the conditional response requires extra machinery (the AGI has to recognize your friendliness and care about it), and that in turn is less likely than an AGI which is unfriendly to you unconditionally.