I don’t know if this solves very much. As you say, if we use the number 1, then we shouldn’t wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal’s Muggings. But if we start going for less than one, then we’re just defining away Pascal’s Mugging by fiat, saying “this is the level at which I am willing to stop worrying about this”.
Also, as some people elsewhere in the comments have pointed out, this makes probability non-additive in an awkward sort of way. Suppose that if you eat unhealthy, you increase your risk of each of one million different diseases by one in a million. Suppose also that eating healthy is a mildly unpleasant sacrifice, but getting a disease is much worse. If we calculate this out disease-by-disease, each disease is a Pascal’s Mugging and we should choose to eat unhealthy. But if we calculate it for the broad category of “getting some disease or other”, then our chances are quite high and we should eat healthy. But it’s very strange that our ontology/categorization scheme should affect our decision-making. This becomes much more dangerous when we start talking about AIs.
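To make the aggregation point concrete, here is a quick sketch with the example’s numbers (assuming, for simplicity, that the disease risks are independent):

```python
# Per-disease versus aggregated risk, using the numbers from the example above.
p_each = 1e-6           # added chance of each particular disease
n_diseases = 10**6      # number of diseases affected

# Considered one disease at a time, each expected count is negligible:
expected_per_disease = p_each                      # 0.000001

# Considered as "some disease or other", the risk is large:
p_at_least_one = 1 - (1 - p_each) ** n_diseases    # ~0.632, i.e. about 1 - 1/e

print(expected_per_disease, p_at_least_one)
```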
Also, does this create weird nonlinear thresholds? For example, suppose that you live on average 80 years. If some event which causes you near-infinite disutility happens every 80.01 years, you should ignore it; if it happens every 79.99 years, then preventing it becomes the entire focus of your existence. But it seems nonsensical for your behavior to change so drastically based on whether an event happens every 79.99 years or every 80.01 years.
Also, a world where people follow this plan is a world where I make a killing on the Inverse Lottery (rules: 10,000 people take tickets; each ticket holder gets paid $1, except a randomly chosen “winner”, who must pay $20,000).
How confident are you that this is false?
I started writing a completely wrong response to this, and it seems worth mentioning for the benefit of anyone else whose brain has the same bugs as mine.
I was going to propose replacing “Compute how many times you expect this to happen to you; treat it as normal if n >= 1 and ignore it completely if n < 1” with “treat it as normal if n >= 1, ignore it completely if n < 0.01, and interpolate smoothly between those for intermediate n”.
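For concreteness, a minimal sketch of that replacement rule; the log-scale interpolation between the two thresholds is my own guess at what “interpolate smoothly” could mean:

```python
import math

def concern_weight(n, lo=0.01, hi=1.0):
    """How much weight to give a risk you expect to encounter n times in your
    life: full weight at n >= hi, none at n <= lo, smooth in between."""
    if n >= hi:
        return 1.0
    if n <= lo:
        return 0.0
    # interpolate linearly in log(n) between the two thresholds
    return (math.log(n) - math.log(lo)) / (math.log(hi) - math.log(lo))

print(concern_weight(2.0), concern_weight(0.1), concern_weight(0.001))  # 1.0, 0.5, 0.0
```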
But all this does, if the (dis)utility of the event is hugely greater than that of everything else you care about, is to push the threshold where you jump between “ignore everything except this” and “ignore this” further out, maybe from “once per 80 years” to “once per 8000 years”.
I fear we really do need something like bounded utility to make that problem go away.
If what you dislike is a discontinuity, you still get a discontinuity at the bound.
I am not a utilitarian, but I would look for a way to deal with the issue at the meta level. Why would you believe the bet that Pascal’s Mugger offers you?
At a more prosaic level (e.g. seat belts) this looks to be a simple matter of risk tolerance and not that much of a problem.
What do you mean by “you still get a discontinuity at the bound”? (I am wondering whether by “bounded utility” you mean something like “unbounded utility followed by clipping at some fixed bounds”, which would certainly introduce weird discontinuities but isn’t at all what I have in mind when I imagine an agent with bounded utilities.)
I agree that doubting the mugger is a good idea, and in particular I think it’s entirely reasonable to suppose that the probability that anyone can affect your utility by an amount U must decrease at least as fast as 1/U for large U, which is essentially (except that I was assuming a Solomonoff-like probability assignment) what I proposed on LW back in 2007.
Now, of course an agent’s probability and utility assignments are whatever they are. Is there some reason, other than wanting to avoid a Pascal’s mugging, why that condition should hold? Well, if it doesn’t hold then your expected utility diverges, which seems fairly bad. (Though I seem to recall seeing an argument, from Stuart Armstrong or someone along those lines, to the effect that if your utilities aren’t bounded then your expected utility in some situations pretty much has to diverge anyway.)
(We can’t hope for a much stronger reason, I think. In particular, your utilities can be just about anything, so there’s clearly no outright impossibility or inconsistency about having utilities that “increase too fast” relative to your probability assignments.)
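A tiny numeric illustration of why that decay condition defuses any single mugging (the constant c below is an arbitrary illustrative choice, not anything specified above):

```python
# If the probability that someone can really change your utility by U falls at
# least as fast as c/U, then the expected value of any one such offer is at
# most c, no matter how astronomical the promised U is.
c = 0.01
for U in (1e3, 1e9, 1e30, 1e300):
    p = c / U          # the most generous probability the condition allows
    print(U, p * U)    # expected gain from the offer: always exactly c here
```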
What do you have in mind?
Crudely, something like this: Divide the value-laden world up into individuals and into short time-slices. Rate the happiness of each individual in each time-slice on a scale from −1 to +1. (So we suppose there are limits to momentary intensity of satisfaction or dissatisfaction, which seems reasonable to me.) Now, let h be a tiny positive number, and assign overall utility (1/h) * tanh(sum of atanh(h * local utility)).
Because h is extremely small, for modest lifespans and numbers of agents this is very close to net utility = sum of local utility. But we never get |overall utility| > 1/h.
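A short sketch of that aggregation rule (the specific value of h below is an arbitrary illustrative choice):

```python
import math

H = 1e-9  # the tiny positive constant h

def overall_utility(local_utilities, h=H):
    """Aggregate local utilities (each in [-1, 1]) as (1/h) * tanh(sum of
    atanh(h * u)): nearly linear for ordinary inputs, never reaching 1/h."""
    return math.tanh(sum(math.atanh(h * u) for u in local_utilities)) / h

# For ordinary numbers of agent/time-slices this is almost exactly the plain sum:
print(overall_utility([0.5, -0.2, 0.9]))   # ~1.2

# But it saturates: N maximally happy slices are worth (1/h) * tanh(N * atanh(h)),
# which approaches but never exceeds 1/h = 1e9 however large N gets.
for n_slices in (1e6, 1e9, 1e12):
    print(n_slices, math.tanh(n_slices * math.atanh(H)) / H)
```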
Now, this feels like an ad hoc trick rather than a principled description of how we “should” value things, and I am not seriously proposing it as what anyone’s utility function “should” (or does) look like. But I think one could make something of a case for an agent that works more like this: just add up local utilities linearly, but weight the utility for agent A in timeslice T in a way that decreases exponentially (with small constant) with the description-length of (A,T), where the way in which we describe things is inevitably somewhat referenced to ourselves. So it’s pretty easy to say “me, now” and not that much harder to say “my wife, an hour from now”, so these two are weighted similarly; but you need a longer description to specify a person and time much further away, and if you have 3^^^3 people then you’re going to get weights about as small as 1/3^^^3.
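And a sketch of the second idea, with the description lengths invented purely for illustration (a real agent would derive them from its own knowledge):

```python
# Weight each (agent, time-slice) pair by 2^(-description length in bits) and
# just add up the weighted local utilities linearly.
hypothetical_description_bits = {
    ("me", "now"): 2,
    ("my wife", "an hour from now"): 6,
    ("a stranger on another continent", "a century from now"): 40,
    ("one specific person out of 3^^^3", "the far future"): 500,
}

def weight(bits):
    return 2.0 ** (-bits)

def weighted_total(local_utilities):
    """local_utilities: {(agent, time): value in [-1, 1]}"""
    return sum(weight(hypothetical_description_bits[key]) * u
               for key, u in local_utilities.items())

print(weighted_total({("me", "now"): 1.0, ("my wife", "an hour from now"): 1.0}))
```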
Let me see if I understand you correctly.
You have a matrix of (number of individuals) x (number of time-slices). Each matrix cell has value (“happiness”) that’s constrained to lie in the [-1..1] interval. You call the cell value “local utility”, right?
And then you, basically, sum up the cell values, re-scale the sum to fit into a pre-defined range and, in the process, add a transformation that makes sure the bounds are not sharp cut-offs, but rather limits which you approach asymptotically.
As to the second part, I have trouble visualising the language in which the description-length would work as you want. It seems to me it will have to involve a lot of scaffolding which might collapse under its own weight.
“You have a matrix …”: correct. “And then …”: whether that’s correct depends on what you mean by “in the process”, but it’s certainly not entirely unlike what I meant :-).
Your last paragraph is too metaphorical for me to work out whether I share your concerns. (My description was extremely handwavy so I’m in no position to complain.) I think the scaffolding required is basically just the agent’s knowledge. (To clarify a couple of points: not necessarily minimum description length, which of course is uncomputable, but something like “shortest description the agent can readily come up with”; and of course in practice what I describe is way too onerous computationally but some crude approximation might be manageable.)
The basic issue is whether the utility weights (“description lengths”) reflect the subjective preferences. If they do, it’s an entirely different kettle of fish. If they don’t, I don’t see why “my wife” should get much more weight than “the girl next to me on a bus”.
I think real people have preferences whose weights decay with distance—geographical, temporal and conceptual. I think it would be reasonable for artificial agents to do likewise. Whether the particular mode of decay I describe resembles real people’s, or would make an artificial agent tend to behave in ways we’d want, I don’t know. As I’ve already indicated, I’m not claiming to be doing more than sketch what some kinda-plausible bounded-utility agents might look like.
One easy way to do this is to map an unbounded utility function onto a finite interval. You will end up with the same order of preferences, but your choices won’t always be the same. In particular you will start avoiding cases of the mugging.
Not really avoiding—a bound on your utility in the context of a Pascal’s Mugging is basically a bound on what the Mugger can offer you. For any probability of what the Mugger promises there is some non-zero amount that you would be willing to pay and that amount is a function of your bound (and of the probability, of course).
However, utility that asymptotically approaches a bound is likely to have its own set of problems. Here is a scenario after five seconds of thinking:
That vexatious chap Omega approaches you (again!) and this time instead of boxes offers you two buttons, let’s say one of them is teal-coloured and the other is cyan-coloured. He says that if you press the teal button, 1,000,001 people will be cured of terminal cancer. But if you press the cyan button, 1,000,000 people will be cured of terminal cancer plus he’ll give you a dollar. You consult your utility function, happily press the cyan button and walk away richer by a dollar. Did something go wrong?
Yes, something went wrong in your analysis.
I suggested mapping an unbounded utility function onto a finite interval. This preserves the order of the preferences in the unbounded utility function.
In my “unbounded” function, I prefer saving 1,000,001 people to saving 1,000,000 and getting a dollar. So I have the same preference with the bounded function, and so I press the teal button.
If you want to do all operations—notably, adding utility and dollars—before mapping to the finite interval, you still fall prey to Pascal’s Mugging, and I don’t see the point of the mapping at all in this case.
The mapping is of utility values, e.g.
In my unbounded function I might have:
Saving 1,000,000 lives = 10,000,000,000,000 utility.
Saving 1,000,001 lives = 10,000,010,000,000 utility.
Getting a dollar = 1 utility.
Saving 1,000,000 lives and getting a dollar = 10,000,000,000,001 utility.
Here we have getting a dollar < saving 1,000,000 lives < saving 1,000,000 lives and getting a dollar < saving 1,000,001 lives.
The mapping is a one-to-one function that maps values between negative and positive infinity to a finite interval, and preserves the order of the values. There are a lot of ways to do this, and it will mean that the utility of saving 1,000,001 lives will remain higher than the utility of saving 1,000,000 lives and getting a dollar.
But it preserves this order, not everything else, and so it can still avoid Pascal’s Mugging. Basically the mugging depends on multiplying the utility by a probability. But since the utility is numerically bounded, that product tends toward zero as the probability gets very low. This does mean that my system can give different results when betting is involved. But that’s what we wanted, anyway.
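A minimal sketch of this, using u / (1 + |u|) as one possible order-preserving map and an invented mugger offer (exact rationals are used only so the nearly-identical large values don’t get rounded together):

```python
from fractions import Fraction

def bounded(u):
    """Order-preserving map from (-inf, inf) onto (-1, 1): u -> u / (1 + |u|)."""
    u = Fraction(u)
    return u / (1 + abs(u))

# The order of the unbounded utilities above survives the mapping:
assert bounded(1) < bounded(10_000_000_000_000) \
       < bounded(10_000_000_000_001) < bounded(10_000_010_000_000)

# But expected values change.  An offer of 10**100 utility at probability
# 10**-50 dominates the unbounded calculation, yet after mapping it is worth
# far less than a single unit of utility held for certain:
p = Fraction(1, 10**50)
print(float(p * bounded(10**100)))   # ~1e-50
print(float(bounded(1)))             # 0.5
```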
Oh, I see. So you do all the operations on the unbounded utility, but calculate the expected value of the bounded version.
And in fact, it has taken lots of pushing to make those precautions (seatbelts, fire insurance, healthy eating) common enough that we can no longer say that no one takes them. (Looking back at the 90s and early 2000s, it feels like wearing one’s seatbelt at all times was pretty contrarian where I live; this is only changing thanks to intense advertising campaigns.)
Isn’t this dealt with in the above by aggregating all the deals of a certain probability together?
(number of deals that you can make in your life that have this probability) * (PEST) < 1
Maybe the expected numbers of major car crashes, dangerous fires, etc. that you experience are each less than 1, but the expected number of all such things put together might be greater than 1.
There might be issues with how to group such events though, since only considering things with the exact same probability together doesn’t make sense.
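For what it’s worth, the aggregation works because expected counts add across categories; with some purely hypothetical lifetime numbers:

```python
# Each expected count is below 1, so each event taken alone might be dismissed,
# but the expected number of "bad events of some kind" exceeds 1.
hypothetical_expected_counts = {
    "serious car crash": 0.4,
    "dangerous house fire": 0.3,
    "cancer diagnosis": 0.4,
}
print(sum(hypothetical_expected_counts.values()))   # 1.1
```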
Doesn’t it actually make sense to put that threshold at the predicted usable lifespan of the universe?
The sharp 79.99-versus-80.01 cutoff only applies if you’re going to live exactly 80 years. If your lifespan is some distribution centered around 80 years, you should gradually stop caring as the interval between occurrences of the event grows past 80 years, the amount by which you stop caring depending on the distribution. It doesn’t go all the way to zero until the chance that you’ll live that long is zero.
(Of course, you could reply that your chance of living to some age doesn’t go to exactly zero, but all that is necessary to prevent the mugging is that it goes down fast enough.)
What are the odds that our ideas about eating healthy are wrong?
The point of Pascal’s mugging is things with essentially negligible probability: things that will never happen, ever, not even once in 3^^^3 universes, and possibly much rarer than that. People do get in car accidents and get cancer all the time; you shouldn’t ignore those probabilities.
Having a policy of heeding small risks like those is fine. Over the course of your life they add up, and you will very probably end up better off for it.
But having a policy of paying the mugger, of following expected utility in extreme cases, will never ever pay off. You will always be worse off than you otherwise would be.
So in that sense it isn’t arbitrary. There is an actual threshold such that ignoring risks below it gives you the best median outcome. Following expected utility above that threshold works out for the best; following it on risks below the threshold is more likely to leave you worse off.
If you knew your full probability distribution of possible outcomes, you could exactly calculate that number.
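A toy calculation of the expected-value-versus-median distinction, with all the numbers invented for illustration:

```python
# A mugger offers a 1-in-10**9 chance of a payoff worth 10**12 (in dollar-
# equivalent utility) for a $5 fee, and you get 10,000 such offers in a lifetime.
p, payoff, fee, n_offers = 1e-9, 1e12, 5, 10_000

ev_per_offer = p * payoff - fee          # +995: expected utility says always pay
p_never_winning = (1 - p) ** n_offers    # ~0.99999: the overwhelmingly likely case
median_outcome = -fee * n_offers         # -50,000: what almost certainly happens

print(ev_per_offer, p_never_winning, median_outcome)
```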