It seems to me that almost every "The AI is an unfriendly failure" story begins with "The humans are wasting too many resources, which I can more efficiently use for something else."
Really? I think the one I see most is “I am supposed to make humans happy, but they fight with each other and make themselves unhappy, so I must kill/enslave all of them”. At least in Hollywood. You may be looking in more interesting places.
Per your AI, does it have an obvious incentive to help people below the median energy level?
To me, that seems like a very similar story; it's just that they're wasting their energy on fighting/unhappiness. I just thought I'd attempt to make an AI that thinks "Humans wasting energy? Under some caveats, I approve!"
I made a quick sample population to run some numbers on incentives: 8 people, using 100, 50, 25, 13, 6, 3, 2, and 1 energy respectively, assuming only one unit of time.
The AI got around 5.8 utility from taking 50 energy from the top person, giving 10 energy each to the bottom 4, and assuming the remaining 10 energy either went unused or was lost as a transaction cost. However, the AI also got about 0.58 utility (relative to doing nothing) from killing any one of the four bottom people, even assuming their energy simply vanished.
Of note, roughly doubling the size of everyone's energy pie gets a greater amount of utility than either of those two things (roughly 10.2), except that they aren't exclusive: you can double the pie and also redistribute the pie (and also kill people who would eat the pie in a way that drags down the median).
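For anyone who wants to check these numbers, here is a minimal sketch, assuming the utility function in question is sqrt(median energy use) × population × T (with T = 1 throughout); the specific allocations are my reading of the plans above.

```python
# Minimal sketch, assuming utility = sqrt(median energy use) * population * T.
from math import sqrt
from statistics import median

def utility(energies, t=1):
    return sqrt(median(energies)) * len(energies) * t

base = [100, 50, 25, 13, 6, 3, 2, 1]
u0 = utility(base)                      # median 9.5 -> ~24.7

# "Tax the 100s": take 50 from the top person, give 10 each to the
# bottom four, and let the leftover 10 vanish as a transaction cost.
taxed = [50, 50, 25, 16, 13, 13, 12, 11]
print(utility(taxed) - u0)              # ~5.8

# Kill the bottom person (their energy simply vanishes).
print(utility([100, 50, 25, 13, 6, 3, 2]) - u0)   # ~0.58

# Double everyone's energy pie instead.
print(utility([2 * e for e in base]) - u0)        # ~10.2
```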
Here's an even more bizarre note: when I quadrupled the population (putting four people at each energy level, so 100×4, 50×4, 25×4, 13×4, 6×4, 3×4, 2×4, 1×4), the algorithm gained plenty of additional utility. However, the amount of utility the algorithm gained by murdering the bottom person skyrocketed (to around 13.1), because while that murder still only moves the median from 9.5 to 13, the square root of that median is now multiplied by a much larger population (31 people instead of 7). So, if for some reason the energy gap between the person just below the median and the person just above the median is large, the AI has a significant incentive to murder one person.
In fact, the way I set it up, the AI even has an incentive to murder the bottom 9 people to get the median up to 25... but not by very much, since each person it murders before the median actually shifts is a substantial disutility. The AI would have gained more utility by just implementing the "Tax the 100s" plan I gave earlier than by instituting either of those two plans, but again, they aren't exclusive.
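The quadrupled-population numbers can be reproduced the same way (again assuming the sqrt(median) × population × T utility function):

```python
# The same sketch, scaled up to the quadrupled population of 32.
from math import sqrt
from statistics import median

def utility(energies, t=1):
    return sqrt(median(energies)) * len(energies) * t

big = sorted([100, 50, 25, 13, 6, 3, 2, 1] * 4)
u0 = utility(big)                       # median 9.5 -> ~98.6

# Killing one bottom person jumps the median from 9.5 to 13.
print(utility(big[1:]) - u0)            # ~13.1

# Killing the bottom nine pushes the median up to 25, but each murder
# before the median actually moves costs roughly sqrt(9.5) in utility.
print(median(big[9:]))                  # 25
print(utility(big[9:]) - u0)            # ~16.4
```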
I somehow got: murder can be justified, but only of people below the median, and only in those cases where it jukes the median upward sufficiently; in general, helping them by taking from people above the median is more effective, but you can do both.
Assuming a smoother distribution of energy expenditures in the population of 32 appeared to prevent this problem. Given a smoother energy expenditure, the median does not jump by so much when a bottom person dies, and murdering bottom people goes back to causing disutility.
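For illustration, here is a hypothetical smooth distribution (not the exact one I used) scored with the same assumed utility function:

```python
# Illustrative only: a hypothetical smooth distribution over 32 people.
from math import sqrt
from statistics import median

def utility(energies, t=1):
    return sqrt(median(energies)) * len(energies) * t

smooth = list(range(1, 33))             # energies 1, 2, ..., 32
u0 = utility(smooth)                    # median 16.5 -> ~130.0

# Killing the bottom person only nudges the median from 16.5 to 17,
# nowhere near enough to offset losing a whole person.
print(utility(smooth[1:]) - u0)         # ~ -2.2 (a net loss)
```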
However, I have to admit that in terms of novel ways an algorithm could fail, I did not see the above coming: I knew it was going to fail, but I didn't realize it might also fail in such an oddly esoteric manner, in addition to the obvious failure I already mentioned.
Thank you for encouraging me to look at this in more detail!
Note that killing people is not the only way to raise the median. Another technique is taking resources and redistributing them. The optimal first-level strategy is to allow only the minimum necessary for survival to those below the median (which, depending on what it thinks "survival" means, might include just freezing them, or cutting off all unnecessary body parts and feeding them barely nutritious glop while storing them in the dark), and to distribute everything else equally among the rest.
Also, given this strategy, the median of human consumption is roughly 2×R/(N+1), where R is the total amount of resources and N is the total number of humans (R gets split evenly among the (N-1)/2+1 people at or above the median). The utility function then becomes sqrt(2×R/(N+1)) × N × T, which means that for the same resources, its utility is maximized if the maximum number of people use them. Thus, the AI will spend its time finding the smallest possible increment above "minimum necessary for survival" and maximizing the number of people it can sustain, keeping (N-1)/2 people at the minimum and (N-1)/2+1 just a tiny bit above it, and making sure it does this for the longest possible time.
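Here is a rough numerical check of that strategy, treating the survival minimum as negligible and picking arbitrary R and N:

```python
# Rough check: split R evenly among the (N-1)/2 + 1 people at or above
# the median, leave the other (N-1)/2 people at ~0 (the survival minimum).
# R and N are arbitrary illustrative numbers.
from math import sqrt
from statistics import median

def utility(energies, t=1):
    return sqrt(median(energies)) * len(energies) * t

R, N = 1000.0, 101
share = R / ((N - 1) // 2 + 1)          # equals 2*R/(N+1)
alloc = [0.0] * ((N - 1) // 2) + [share] * ((N - 1) // 2 + 1)

print(median(alloc), 2 * R / (N + 1))   # both ~19.6
print(utility(alloc))                   # ~ sqrt(2R/(N+1)) * N ~ 447
```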