My model is that friendship is one particular strategy for alliance-formation that happened to evolve in humans. I expect this is natural in the sense of being a local optimum (in the ancestral environment), but probably not in the sense of being simple to formally define or implement.
I think friendship is substantially more complicated than “I care some about your utility function”. For instance, you probably stop valuing their utility function if they betray you (friendship can “break”). I also think the friendship algorithm includes a bunch of signalling to help with coordination (so that you understand the other person is trying to be friends), and some less-pleasant stuff like evaluations of how valuable an ally the other person is and how the friendship will affect your social standing.
Friendship also appears to include some sort of check that the other person is making friendship-related decisions using system 1 rather than system 2. This may be a security feature that makes the algorithm harder to consciously exploit (with the unfortunate side effect that we penalize system-2 thinkers even when they sincerely want to be allies), or it may simply be that the signalling parts evolved for system 1 and don't generalize properly.
(One could claim that “the true spirit of friendship” is loving someone unconditionally or something, and that might be simple, but I don’t think that’s what humans actually implement.)
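To make the contrast with "I care some about your utility function" concrete, here is a toy sketch. It is purely illustrative: every class name, weight, and flag below is a hypothetical stand-in, not anything humans literally compute.

```python
from dataclasses import dataclass


@dataclass
class SimpleCaring:
    """Baseline: my utility is my payoff plus a fixed fraction of yours."""
    weight: float = 0.3

    def my_utility(self, own_payoff: float, their_payoff: float) -> float:
        return own_payoff + self.weight * their_payoff


@dataclass
class FriendshipAlgorithm:
    """Adds the extra machinery described above: the caring weight collapses
    on betrayal, scales with how valuable an ally the other person looks,
    and is discounted if their friendly moves seem consciously calculated
    (the system-2 check)."""
    base_weight: float = 0.3
    broken: bool = False

    def update_on_betrayal(self) -> None:
        # Friendship can "break": stop valuing their utility function.
        self.broken = True

    def effective_weight(self, ally_value: float, seems_calculated: bool) -> float:
        if self.broken:
            return 0.0
        weight = self.base_weight * ally_value   # evaluation of ally value
        if seems_calculated:                     # penalize system-2 signals
            weight *= 0.5
        return weight

    def my_utility(self, own_payoff: float, their_payoff: float,
                   ally_value: float = 1.0, seems_calculated: bool = False) -> float:
        return own_payoff + self.effective_weight(ally_value, seems_calculated) * their_payoff


if __name__ == "__main__":
    simple = SimpleCaring()
    friend = FriendshipAlgorithm()
    print(simple.my_utility(1.0, 1.0))   # 1.3, and stays 1.3 no matter what
    print(friend.my_utility(1.0, 1.0))   # 1.3 while the friendship holds
    friend.update_on_betrayal()
    print(friend.my_utility(1.0, 1.0))   # 1.0: the caring term switches off
```

The point of the sketch is only that the state and conditionals attached to the caring weight are part of the algorithm, not incidental to it.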
Yeah, I agree that humans implement something more complex. But it is what we want the AI to implement, isn't it? And it looks like it may be quite a natural abstraction to have.
(But again, it's useless as long as we don't know how to direct an AI toward that specific abstraction.)
Then we're no longer talking about "the way humans care about their friends"; we're inventing new hypothetical algorithms that we might like our AIs to use. Humans no longer provide an example of how that behavior could arise naturally in an evolved organism, nor a case study of how it works out for people to behave that way.