This fails either when agents aren’t totally selfish (if, like you, they’re looking for what’s optimal for everyone, which is a very different problem)
It’s not very different—you just need to alter the agents’ utility functions slightly, to value the other player gaining utility as well.
E.g. take the standard Prisoner’s Dilemma: 5 for me, 0 for you if I betray you while you cooperate; 3 each if we both cooperate; 1 each if we both defect. The equilibrium is defect / defect. Now let’s make our agents’ utility functions look altruistic—each agent gets its own reward, then adds the opponent’s reward on top (no loops; each agent just adds the opponent’s raw reward once, at the end). Now our payoff is 5+0 for me, 0+5 for you if I betray you, 3+3 each if we cooperate, and 1+1 each if we defect.
A purely self-interested agent with that utility function has the equilibrium at cooperate / cooperate.
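To make that concrete, here’s a minimal sketch in Python (my own illustration, not from the parent comment—the function names are mine) that brute-forces the pure-strategy Nash equilibria before and after the altruistic transformation u_i′ = u_i + u_j:

```python
from itertools import product

# Payoffs from the example above: PAYOFF[(my_move, their_move)] = my reward.
# Moves: "C" = cooperate, "D" = defect (betray).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def selfish(mine, theirs):
    return PAYOFF[(mine, theirs)]

def altruistic(mine, theirs):
    # Add the opponent's raw reward once at the end -- no loops.
    return PAYOFF[(mine, theirs)] + PAYOFF[(theirs, mine)]

def pure_nash_equilibria(utility):
    """Profiles where neither player gains by unilaterally deviating."""
    moves = ("C", "D")
    equilibria = []
    for a, b in product(moves, moves):
        a_best = all(utility(a, b) >= utility(alt, b) for alt in moves)
        b_best = all(utility(b, a) >= utility(alt, a) for alt in moves)
        if a_best and b_best:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(selfish))     # [('D', 'D')]
print(pure_nash_equilibria(altruistic))  # [('C', 'C')]
```

Each agent still just maximizes its own number; only the number being maximized has changed.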
More generally, the math in games involving selfish players goes through if you represent the players’ altruism in their own utility functions, so each can still simply pick the highest number ‘selfishly’.