This is not how bounded utility functions work. The fact that it's bounded doesn't mean it reaches a perfect "plateau" at some point; it can approach its upper bound asymptotically. For example, a bounded paperclip maximizer can use the utility function 1 - exp(-N / N0), where N is the "number of paperclips in the universe" and N0 is a constant.
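To make that concrete, here's a small Python sketch (the constant N0 = 10^6 is an arbitrary illustrative choice, not part of the original setup):

import math

def paperclip_utility(n, n0=1e6):
    # Bounded utility: approaches 1 asymptotically but never reaches it.
    return 1 - math.exp(-n / n0)

for n in [1e5, 1e6, 3e6, 1e7]:
    print(f"N = {n:.0e}: utility = {paperclip_utility(n):.8f}")
# The marginal utility of each extra paperclip keeps shrinking, yet it never
# hits zero, so there is no hard plateau.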
OK, here’s a reason to be against a utility function like the one you describe. Let’s use H to denote the number of happily living humans in the universe. Let’s say that my utility function has some very high threshold T of happy lives such that as H increases past T, although the function does continue to increase monotonically, it stays only barely above the value it takes at T. Now suppose civilization has 2T people living in it. There’s a civilization-wide threat, but it’s not a very serious one: with probability 0.00001 it will destroy all the colonized worlds in the universe except for 4. However, a measure has been proposed that would completely mitigate the threat, at the cost of sacrificing the lives of half of the civilization’s population, bringing the population all the way back down to T. If I understand your proposal correctly, an agent operating under your proposed utility function would choose to take this measure, since a world with T people does not have an appreciably lower utility than a world with 2T people, relative to even a tiny risk of an actual dent in the proposed utility function.
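To put rough numbers on this, here's a Python sketch of the expected-utility comparison; the specific functional form, the saturation scale, and the surviving-population figure are my own illustrative assumptions, not part of the proposal:

import math

T = 1e12          # hypothetical threshold of happy lives
scale = T / 20    # chosen so the utility is essentially flat past T

def u(H):
    return 1 - math.exp(-H / scale)

p_disaster = 0.00001
survivors_if_disaster = 4e10   # made-up population of the 4 surviving worlds

eu_do_nothing = (1 - p_disaster) * u(2 * T) + p_disaster * u(survivors_if_disaster)
eu_mitigate = u(T)             # sacrifice T lives; threat fully removed

print(eu_mitigate > eu_do_nothing)   # True under these assumptions
# The certain loss of T lives costs ~2e-9 in utility, while the 1e-5 risk
# costs ~4.5e-6, so the agent sacrifices half the population.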
To put it more succinctly, an agent operating under this utility function with total utilitarianism, while running an extremely large civilization, would have no qualms about throwing marginal lives away to mitigate remotely unlikely civilizational threats. We could replace the Pascal’s Mugging scenario with a Pascal’s Murder scenario: I don’t like Joe, so I go tell the AI running things that Joe has an unfoilable plan to infect the extremely well-defended interstellar internet with a virus that will send humanity back to the Stone Age (and that, furthermore, Joe’s hacking skills and defenses are so comprehensive that the only way to deal with him is to kill him immediately, as opposed to doing further investigation or taking a humane measure like putting him in prison). Joe’s life is such a negligible dent in the AI’s utility function compared to the loss of all civilization that the AI complies with my request. Boom: everyone in society has a button that lets them kill arbitrary people instantly.
It’s possible that you could (e.g.) ditch total utilitarianism and construct your utility function in some other clever way to avoid this problem. I’m just trying to demonstrate that it’s not obviously a bulletproof solution.
It seems to me that any reasoning, be it with bounded or unbounded utility, will support avoiding unlikely civilizational threats at the expense of a small number of lives for sufficiently large civilizations. I don’t see anything wrong with that (in particular, I don’t think it leads to mass murder, since that would have a significant utility cost).
There is a different, related problem: if the utility function saturates around (say) 10^10 people and our civilization has 10^20 people, then the death of everyone except some 10^15 people will be acceptable in order to prevent a much-lower-probability event killing everyone except some 10^8 people. However, this effect disappears once we sum over all possible universes weighted by the Solomonoff measure, as we should (like done here). Effectively it normalizes the utility function to saturate at the actual capacity of the multiverse.
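A rough numerical sketch of that failure mode, before the Solomonoff-weighted fix; the functional form and the probability are my own illustrative choices, while the population figures are the ones quoted above:

import math

def u(H, H0=1e10):
    # A utility function that saturates around 1e10 people.
    return 1 - math.exp(-H / H0)

p = 1e-9   # "much lower probability" (arbitrary illustrative value)

eu_accept_loss = u(1e15)                          # certain reduction to 10^15 people
eu_do_nothing = (1 - p) * u(1e20) + p * u(1e8)    # risk the rarer, worse event

print(eu_accept_loss >= eu_do_nothing)   # True
# 10^15 and 10^20 both sit so far past the saturation point that their
# utilities are indistinguishable, so any nonzero p favors the huge certain loss.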