Just a sidenote, but IMO the solution to Pascal’s mugging is simply using a bounded utility function. I don’t understand why people insist on unboundedness.
Rejecting components with extremely low probabilities, even if they have extremely large utilities, works better, avoids selection bias, and is also what common sense would tell you to do. I think of it as “ignore noise”. At least it works for humans; an AGI may have to do something else.
I don’t understand in what sense it works better, or the relation to selection bias. Ignoring components with low probability is a good heuristic, since you have bounded computing resources. However, it seems reasonable to require your decision theory to deal with extreme cases in which you can spare the computing resources to take these components into account. On the other hand, I agree that it is rarely practical for humans to emulate perfect Bayesian reasoners.
Selection bias: when you are presented with a specific mugging scenario, you ought to realize that there are many more extremely unlikely scenarios where the payoff is just as high, so selecting just one of them to act on is suboptimal.
As for the level at which to stop calculating, bounded computational power is a good heuristic. But I suspect that there is a better way to detect the cutoff (analogous to an infrared cutoff in physics). If you plot the number of choices against their (log) probability, once you go low enough the number of choices explodes. My guess is that for reasonably low probabilities you would get an exponential increase in the number of outcomes, but for very low probabilities the growth in the number of outcomes becomes super-exponential. This is, of course, speculation; I would love to see some calculation or modeling, but this is what my intuition tells me.
There can’t be more than 2^n outcomes each with probability 2^-n.
OK, I have thought about it some more. The issue is how accurately one can evaluate the probabilities. If the best you can do is, say, 1%, then you are forced to count even the potentially very unlikely possibilities at 1% odds. The accuracy of the probability estimates would depend on something like the depth of your Solomonoff induction engine. If you are confronted with a Pascal’s mugger and your induction engine returns “the string required to model the mugger as honest and capable of carrying out the threat is longer than the longest algorithm I can process”, you are either forced to use the probability corresponding to the longest string, or to discard the hypothesis outright. What I am saying is that the latter is better than the former.
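To make the two options concrete, here is a minimal toy sketch (MAX_BITS and the function names are mine, purely illustrative):

```python
MAX_BITS = 64  # hypothetical limit on the hypothesis length the engine can process

def hypothesis_prior(description_bits):
    """Prior weight 2^-L for an L-bit hypothesis, or None if it exceeds capacity."""
    if description_bits > MAX_BITS:
        return None
    return 2.0 ** -description_bits

def mugging_term(description_bits, payoff, discard_oversized):
    """Expected-utility contribution of the mugger's hypothesis under each policy."""
    p = hypothesis_prior(description_bits)
    if p is None:
        # "use the probability corresponding to the longest string" vs. "discard outright"
        p = 0.0 if discard_oversized else 2.0 ** -MAX_BITS
    return p * payoff

# With a payoff far larger than 2^MAX_BITS, the clamping policy is dominated by
# the mugging term, while the discarding policy ignores it entirely.
print(mugging_term(10_000, 1e30, discard_oversized=False))  # ~5.4e10
print(mugging_term(10_000, 1e30, discard_oversized=True))   # 0.0
```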
The primary problem with Pascal’s Mugging is that the Mugging string is short and easy to evaluate. 3^^^3 is a big number; it implies a very low probability, but not necessarily 1 / 3^^^3; so just how outrageous can a mugging be without being discounted for low probability? That least-likely-but-still-manageable Mugging will still get you. If you’re allowed to reason about descriptions of utility, and not just shut up and multiply to evaluate the utility of simulated worlds, then in the worst case you have to worry about the Mugger that offers you BusyBeaver(N) utility, where 2^-N is the lowest probability that you can process. BusyBeaver(N) is well-defined although uncomputable, and it is at least as large as the output of any other N-bit program. Unfortunately that means BusyBeaver(N) × 2^-N > C for some N-bit constant C, or in other words EU(Programs-of-length-N) is O(BusyBeaver(N)). It doesn’t matter what the mugger offers, or whether you mug yourself. Any N-bit utility calculation program has expected utility O(BB(N)) because it *might* yield BB(N) utility.
The best not-strictly-bounded-utility solution I have against this is discounting the probability of programs as a function of their running time as well as their length. Let 1/R be the probability that any given step of a process will cause it to completely fail, as opposed to halting with output or never halting. Solomonoff Induction can be redefined as the sum, over programs P producing an output S in N steps, of 2^-Length(P) × ((R − 1) / R)^N. It is possible to compute a prior probability with error less than B, for any sequence S and finite R, by enumerating all programs shorter than log_2(1 / B) bits that halt in fewer than ~R / B steps. All un-enumerated programs have cumulative probability less than B of generating S.
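A rough sketch of that length-plus-runtime weighting, with made-up parameter values (this is just the stated formula, not any particular induction engine):

```python
def discounted_weight(length_bits, steps, R):
    """Weight of a program: 2^-Length(P) * ((R - 1) / R)^N, i.e. the usual
    length prior times the probability of surviving N steps without failing."""
    survival_per_step = (R - 1.0) / R
    return 2.0 ** -length_bits * survival_per_step ** steps

# A short, fast program keeps essentially all of its length prior...
print(discounted_weight(length_bits=20, steps=1_000, R=1e12))
# ...while an equally short program that needs vastly more steps (think of one
# simulating 3^^^3 humans) is crushed by the runtime penalty.
print(discounted_weight(length_bits=20, steps=10**14, R=1e12))
```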
For Pascal’s Mugging it suffices to determine B based on the number of steps required to, e.g. simulate 3^^^3 humans. 3^^^3 ~= R / B, so either the prior probability of the Mugger being honest is infinitesimal, or it is infinitesimally unlikely that the universe will last fewer than the minimum 3^^^3 Planck Units necessary to implement the mugging. Given some evidence about the expected lifetime of the Universe, the mugging can be rejected.
The biggest advantage of this method over a fixed bounded utility function is that R is parameterized by the agent’s evidence about its environment, and can change with time. The longer a computer successfully runs an algorithm, the larger the expected value of R.
OK, I guess I understand your argument that the mugger can construct an algorithm producing very high utility using only N bits. Or that I can construct a whole whack of similar algorithms in response. And end up unable to do anything because of the forest of low-probability high-utility choices. Which are known to be present if only you spend enough time looking for them. So that’s why you suggest limiting not (only) the number of states, but (also) the number of steps. I wonder what Eliezer and others think about that.
You are right, of course. I have to rethink and rephrase it.
This may be an impossible task, but can you elaborate on your intuition? My intuitions, such as they are, are Solomonoffy and do not predict super-exponential growth.
I tried to elaborate in another comment, suggesting that we should reject all hypotheses more complicated than our Solomonoff engine can handle.
I probably agree in practice, but let’s see if there are other avenues first...
Possibly because there do appear to be potential solutions to Pascal’s mugging that do not require bounding your utility function.
Example:
Suppose I claim that, in general, I find it reasonable to submit and give a Pascal’s mugger the money, and that right now I am being Pascal’s mugged and considering handing over the money.
I also consider: What is the chance that a future mugger will make a Pascal’s mugging with a higher level of super exponentiation and that I won’t be able to pay?
And I claim that the answer appears to be: terribly unlikely, but considering the risks of failing at a higher level of super exponentiation, likely enough that I shouldn’t submit to the current Pascal’s mugger.
Except, that’s ALSO true for the next Pascal’s mugger.
So despite believing in Pascal’s mugging, I act exactly as if I don’t, and my claim that I ‘believe’ in Pascal’s mugging doesn’t actually pay rent (for the muggers).
End of Example.
Since there exist examples like this and others that appear to solve Pascal’s mugging without requiring a bounded utility function, a lot of people wouldn’t want to accept a utility bound just because of the mugging.
I’m afraid that this kind of reasoning cannot avoid the real underlying problem, namely that Solomonoff expectation values of unbounded utility functions tend to diverge, since utility grows as BB(n) while probability falls only as 2^{-n}.
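A quick sketch of that divergence, in my own notation, assuming a hypothesis of description length about n can promise utility BB(n):

$$\sum_{n} 2^{-n}\,\mathrm{BB}(n) = \infty, \qquad \text{since } \mathrm{BB}(n) > 2^{n} \text{ for all sufficiently large } n,$$

so the individual terms of the expectation do not even tend to zero.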
What if the utility function is bounded, but the bound itself is expandable without limit in at least some cases?
For instance, take a hypothetical utility function, Coinflipper bot.
Coinflipper bot has utility equal to the number of fair coins it has flipped.
Coinflipper bot has a utility bound equal to 2^(greatest number of consecutive heads on fair coins it has flipped + 1).
For instance, a particular example of Coinflipper bot might have flipped 512 fair coins and its current record is 10 consecutive heads on fair coins, so its utility is 512 and its utility bound is 2^(10+1) = 2048.
On the other hand, a different instance of Coinflipper bot might have flipped 2 fair coins, gotten 2 tails, and have a utility of 2 and a utility bound of 2^(0+1) = 2.
How would the math work out in that kind of situation?
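A toy simulation of Coinflipper bot as described above, just to make the numbers concrete (the function name and parameters are mine):

```python
import random

def coinflipper(num_flips, seed=None):
    """Return (utility, utility_bound) for one Coinflipper bot instance:
    utility = number of fair coins flipped,
    utility_bound = 2^(longest run of consecutive heads + 1)."""
    rng = random.Random(seed)
    longest_run = current_run = 0
    for _ in range(num_flips):
        if rng.random() < 0.5:      # heads
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:                       # tails
            current_run = 0
    return num_flips, 2 ** (longest_run + 1)

# One run comparable to the 512-flip example above.
print(coinflipper(512))
```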
I don’t understand what you mean by “utility bound”. A bounded utility function is just a function which takes values in a finite interval.
Let me try rephrasing this a bit.
What if, depending on other circumstances (say the flip of a fair coin), your utility function can take values in either a finite (if heads) or infinite (if tails) interval?
Would that entire situation be bounded, unbounded, neither, or is my previous question ill posed?
If you use a bounded utility function, it will inevitably be saturated by unlikely but high-utility possibilities, rendering it useless.
For any possible world W, |P(W) BoundedUtility(W)| < |P(W) UnboundedUtility(W)| as P(W) goes to zero.
Why?
Maybe you bound your utility function so you treat all universes that produce 100 billion DALYs/year as identical. But then you learn that the galaxy can support way more than 100 billion humans. Left with a bounded utility function, you’re unable to make good decisions at a civilizational scale.
This is not how bounded utility functions work. The fact it’s bounded doesn’t mean it reaches a perfect “plateau” at some point. It can approach its upper bound asymptotically. For example, a bounded paperclip maximizer can use the utility function 1 - exp(-N / N0) where N is the “number of paperclips in the universe” and N0 is a constant.
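For concreteness, here is that utility function with an arbitrary, made-up scale constant N0; it keeps increasing with N but never exceeds 1:

```python
import math

N0 = 1e9  # arbitrary scale constant, purely illustrative

def paperclip_utility(n):
    """Bounded utility 1 - exp(-N / N0): monotonically increasing, bounded above by 1."""
    return 1.0 - math.exp(-n / N0)

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"N = {n:.0e}  U = {paperclip_utility(n):.6f}")
```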
OK, here’s a reason to be against a utility function like the one you describe. Let’s use H to indicate the number of happily living humans in the universe. Let’s say that my utility function has some very high threshold T of happy lives such that as H increases past T, although the function does continue to increase monotonically, it’s still not very far above the value it takes on at point T. Now suppose civilization has 2T people living in it. There’s a civilization-wide threat, though not a very serious one: with probability 0.00001 it will destroy all the colonized worlds in the universe except for 4. However, there’s a measure suggested that will completely mitigate the threat. This measure will require sacrificing the lives of half of the civilization’s population, bringing the population all the way back down to T. If I understand your proposal correctly, an agent operating under your proposed utility function would choose to take this measure, since a world with T people does not have an appreciably lower utility than a world with 2T people, relative to the tiny risk of an actual dent in the proposed utility function.
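A toy numerical check of this scenario, assuming for concreteness the asymptotic form from the parent comment (1 − exp(−H/N0)) and made-up values of T, N0, and the post-catastrophe population; it only illustrates the comparison, not anyone’s actual utility function:

```python
import math

N0 = 1e10          # saturation scale of the bounded utility (assumed)
T = 1e12           # the "very high threshold" of happy lives (assumed)
survivors = 4e9    # population left on the 4 surviving worlds (assumed)
p_catastrophe = 0.00001

def u(h):
    return 1.0 - math.exp(-h / N0)

eu_do_nothing = (1 - p_catastrophe) * u(2 * T) + p_catastrophe * u(survivors)
eu_mitigate = u(T)   # sacrifice half the population with certainty

# With these numbers eu_mitigate > eu_do_nothing, i.e. the agent prefers to
# sacrifice T lives, as the comment argues.
print(eu_do_nothing, eu_mitigate, eu_mitigate > eu_do_nothing)
```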
To put it more succinctly, an agent operating under this utility function with total utilitarianism that was running an extremely large civilization would have no qualms throwing marginal lives away to mitigate remotely unlikely civilizational threats. We could replace the Pascal’s Mugging scenario with a Pascal’s Murder scenario: I don’t like Joe, so I go tell the AI running things that Joe has an un-foilable plan to infect the extremely well-defended interstellar internet with a virus that will bring humanity back to the stone ages (and that furthermore, Joe’s hacking skills and defenses are so comprehensive that the only way to deal with Joe is to kill him immediately, as opposed to doing further investigation or taking a humane measure like putting him in prison). Joe’s life is such an unappreciable dent in the AI’s utility function compared to the loss of all civilization that the AI complies with my request. Boom—everyone in society has a button that lets them kill arbitrary people instantly.
It’s possible that you could (e.g.) ditch total utilitarianism and construct your utility function in some other clever way to avoid this problem. I’m just trying to demonstrate that it’s not obviously a bulletproof solution to the problem.
It seems to me that any reasoning, be it with bounded or unbounded utility, will support avoiding unlikely civilizational threats at the expense of a small number of lives for sufficiently large civilizations. I don’t see anything wrong with that (in particular, I don’t think it leads to mass murder, since that would have a significant utility cost).
There is a different related problem, namely that if the utility function saturates around (say) 10^10 people and our civilization has 10^20 people, then the death of everyone except some 10^15 people will be acceptable to prevent an event killing everyone except some 10^8 people with much lower probability. However, this effect disappears once we sum over all possible universes weighted by the Solomonoff measure, as we should (like done here). Effectively it normalizes the utility function to saturate at the actual capacity of the multiverse.
And the utility function doesn’t have to be bounded by a constant. An agent will “blow out its speakers” if it follows a utility function whose dynamic range is greater than the agent’s Kolmogorov complexity + the evidence the agent has accumulated in its lifetime. The agent’s brain’s subjective probabilities will not have sufficient fidelity for such a dynamic utility function to be meaningful.
Super-exponential utility values are OK, if you’ve accumulated a super-polynomial amount of evidence.
The Solomonoff expectation value of any unbounded computable utility function diverges. This is because the program “produce the first universe with utility > n^2” is roughly of length log n + O(1), and therefore it contributes about 2^{-(log n + O(1))} · n^2, i.e. on the order of n, to the expectation value; since n can be made arbitrarily large, the sum diverges.
Oops, that’s not quite right. But I think that something like that is right :-). 15 Quirrell points to whoever formalizes it correctly first.
Because utility functions are not bounded. Without using infinities, could you describe an outcome so bad that it could never get worse?
Doesn’t mean my utility function is unbounded; it might have a finite infimum but never attain it.
You are thinking in terms of numbers (never gets worse than −1000), when what matters is outcomes. Your finite infimum would have to represent some event that you could describe or else it would have no meaning in ordinal utility terms. (At least I think this is how it works.)
Why? Suppose my utility function is 1/(number of staples in the universe). The infimum of my utility function would be for there to be infinitely many staples in the universe, which I cannot describe without using infinities.
I think you are correct. Good example.
You don’t need infinities for Pascal’s mugging to work.
With a bounded utility function U, you never pay the mugger $X if the probability that she is telling the truth is below X / (Umax − Umin).
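A quick arithmetic sketch of that threshold with made-up numbers, treating X as the utility cost of paying:

```python
def worth_paying(x, p_truth, u_max, u_min):
    """Pay only if p_truth exceeds x / (u_max - u_min), i.e. only if the best
    possible gain p_truth * (u_max - u_min) can outweigh the cost x."""
    return p_truth > x / (u_max - u_min)

# With a utility range of 1000 and a cost of 5, the break-even probability is 0.005:
print(worth_paying(x=5.0, p_truth=1e-9, u_max=1000.0, u_min=0.0))  # False
print(worth_paying(x=5.0, p_truth=0.01, u_max=1000.0, u_min=0.0))  # True
```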