The big reason humans are cosmopolitan might be that we evolved in multipolar environments, where helping others is instrumentally useful. If so, just training AIs in multipolar environments that incentivize cooperation could be all it takes to get some amount of instrumental-made-terminal-by-optimization-failure cosmopolitanism.
Just noting the risk that AIs could learn verifiable cooperation/coordination rather than kindness. This would probably be incentivized by the training (“you don’t profit from being nice to a cooperate-rock”), and could easily cut humans out of the trades that AIs make with one another.
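To make the cooperate-rock point concrete, here's a minimal toy sketch (all the policy names are hypothetical illustrations, not anyone's actual training setup): in an iterated prisoner's dilemma, a reward-maximizing policy earns strictly more by defecting against an unconditional cooperator than by being nice to it, while reciprocity only pays off against agents that condition on your behavior.

```python
# Toy iterated prisoner's dilemma: why training rewards exploiting a
# "cooperate-rock" (unconditional cooperator) rather than kindness.

# Standard PD payoffs to the row player: T > R > P > S
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation (R)
    ("C", "D"): 0,  # sucker's payoff (S)
    ("D", "C"): 5,  # temptation to defect (T)
    ("D", "D"): 1,  # mutual defection (P)
}

def total_reward(policy, opponent, rounds=100):
    """Sum the row player's payoff over repeated play against an opponent."""
    my_hist, opp_hist = [], []
    total = 0
    for _ in range(rounds):
        my_move = policy(opp_hist)       # each policy sees the other's history
        opp_move = opponent(my_hist)
        total += PAYOFF[(my_move, opp_move)]
        my_hist.append(my_move)
        opp_hist.append(opp_move)
    return total

cooperate_rock = lambda hist: "C"                     # cooperates no matter what
nice = lambda hist: "C"                               # kind policy: cooperates anyway
ruthless = lambda hist: "D"                           # exploits unconditional cooperators
tit_for_tat = lambda hist: hist[-1] if hist else "C"  # verifiable reciprocity

print(total_reward(nice, cooperate_rock))      # 300: kindness goes unrewarded here
print(total_reward(ruthless, cooperate_rock))  # 500: exploitation strictly pays more
print(total_reward(ruthless, tit_for_tat))     # 104: ruthlessness loses vs. reciprocators
print(total_reward(tit_for_tat, tit_for_tat))  # 300: reciprocity sustains cooperation
```

The gradient points toward conditional cooperation, not unconditional kindness: being nice is only reinforced when the counterparty can verify and retaliate.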
AIs could learn to cooperate while remaining perfectly selfish. But humans and AIs alike tend to learn easier-to-compute heuristics / “value shards” early in training, and these persist to some extent even after the agent discovers the true optimal policy, although reflection or continued training could stamp the value shards out later.
Maybe, but if the AI is playing a hard competitive game, it will directly learn to be destructively ruthless.