Have you considered evolution? This may be relevant to human selfishness. Suppose there are n agents, each of which has a chance of dying or reproducing (for argument's sake, each reproduction creates a single descendant and the original agent dies immediately, so as to avoid kin-altruism issues; i.e., everyone is a phoenix).
Each agent can dedicate a certain amount of effort to increasing or decreasing their own, or other agents', chances of survival (assume they are all equally skilled at affecting anyone's chances). The agents don't interact in any other way, and have no goals. We start them off with a wide variety of algorithms for making their decisions.
After running the system through a number of reproductive cycles, the surviving agents will be those who dedicate all their effort to increasing their own survival chances (selfish agents), plus perhaps a few small groups that boost each other's chances.
But unless the agents in the small group are running complicated altruistic algorithms, the groups will be unstable: when one member dies, the strategy the remaining members are following becomes less optimal, since some of their effort is now aimed at an agent who is no longer there. And if there is any noise or imperfection in the system (you don't know for sure who you're helping, or you're less effective at helping other agents than at helping yourself), the groups will also decay, leaving only the selfish agents.
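To make the setup concrete, here's a minimal sketch of the kind of simulation I have in mind. The particular survival function, mutation rule, and population-refill rule are my own arbitrary choices, not part of the argument; each agent's "algorithm" is reduced to a fixed weight vector saying how it splits one unit of effort across everyone's survival, including its own.

```python
# Minimal sketch: n agents, each with a weight vector over whose survival
# they spend their one unit of effort on. Every cycle, each agent either
# dies or is replaced by a single mutated descendant (the "phoenix" rule).
# The constants and the linear survival function are illustrative assumptions.
import random

N = 30          # number of agents
CYCLES = 300    # reproductive cycles to simulate
NOISE = 0.05    # mutation noise on inherited effort weights
BASE, GAIN = 0.3, 0.3  # reproduction probability = BASE + GAIN * effort received

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else [1.0 / len(weights)] * len(weights)

def random_agent(n):
    # Random preferences: an arbitrary split of effort across all n agents.
    return normalise([random.random() for _ in range(n)])

def inherit(parent, parent_idx, child_idx):
    # Mutated copy of the parent's weights; the weight the parent put on
    # itself becomes the weight the child puts on itself.
    w = [max(0.0, x + random.gauss(0, NOISE)) for x in parent]
    w[child_idx], w[parent_idx] = w[parent_idx], w[child_idx]
    return normalise(w)

def cycle(pop):
    n = len(pop)
    # Effort received by agent j is everything the population spends on j.
    received = [sum(pop[i][j] for i in range(n)) for j in range(n)]
    reproduced = [j for j in range(n)
                  if random.random() < min(1.0, BASE + GAIN * received[j])]
    if not reproduced:
        reproduced = [random.randrange(n)]
    new_pop = list(pop)
    for j in range(n):
        # Reproducers are replaced by their own descendant; dead slots are
        # refilled by a descendant of a random reproducer, keeping n fixed.
        parent = j if j in reproduced else random.choice(reproduced)
        new_pop[j] = inherit(pop[parent], parent, j)
    return new_pop

pop = [random_agent(N) for _ in range(N)]
print(f"initial average effort on self: {sum(a[i] for i, a in enumerate(pop)) / N:.2f}")
for _ in range(CYCLES):
    pop = cycle(pop)
print(f"final average effort on self:   {sum(a[i] for i, a in enumerate(pop)) / N:.2f}")
```

The re-indexing in `inherit` is what lets "effort spent on myself" be heritable even though "myself" is a different slot each generation, whereas reciprocal arrangements between particular slots get scrambled by death and mutation, which is exactly the decay mechanism described above.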
It sounds like I might have skipped a few inferential steps in this post and/or chosen a bad title. Yes, I'm assuming that if we are selfish, then evolution made us that way. The post starts at the follow-up question "if we are selfish, how might that selfishness be implemented as a decision procedure?" (i.e., how would you program selfishness into an AI?) and then considers "what implications does that have as to what our values actually are or should be?"
What I meant in my post is that, starting with random preferences, the preferences we would designate as selfish are the ones that survive. So what we intuitively think of as selfishness (me-first, a utility function with an index pointing to myself) arises naturally from non-indexical starting points (evolving agents with random preferences).
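For concreteness, here is one toy illustration of the distinction (the world model and names are made up for the example): a non-indexical utility function scores a world state the same no matter who evaluates it, while an indexical, selfish one takes an extra piece of data, an index saying which agent is "me".

```python
# Illustrative sketch only: a toy world where each agent has a welfare score.

def total_welfare(world):
    # Non-indexical: cares about the world state, no privileged agent.
    return sum(world.values())

def make_selfish_utility(my_index):
    # Indexical: the same world state is scored only by what happens to "me".
    def utility(world):
        return world[my_index]
    return utility

world = {"agent_0": 3.0, "agent_1": 7.0, "agent_2": 1.0}
u_selfish = make_selfish_utility("agent_2")
print(total_welfare(world))   # 11.0 -- same answer no matter who evaluates it
print(u_selfish(world))       # 1.0  -- depends on which agent "I" am
```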
If selfishness arose this way, then it is less mysterious what it is, and we could start looking at evolutionarily stable decision theories or suchlike. You don't even need actual evolution, simply "these are the preferences that would be advantageous should the AI be subject to evolutionary pressure".