What do you mean by “you still get a discontinuity at the bound”? (I am wondering whether by “bounded utility” you mean something like “unbounded utility followed by clipping at some fixed bounds”, which would certainly introduce weird discontinuities but isn’t at all what I have in mind when I imagine an agent with bounded utilities.)
I agree that doubting the mugger is a good idea. In particular, I think it’s entirely reasonable to suppose that the probability that anyone can affect your utility by an amount U must decrease at least as fast as 1/U for large U, which is essentially (except that I was assuming a Solomonoff-like probability assignment) what I proposed on LW back in 2007.
Now, of course an agent’s probability and utility assignments are whatever they are. Is there some reason other than wanting to avoid a Pascal’s mugging why that condition should hold? Well, if it doesn’t hold then your expected utility diverges, which seems fairly bad. -- Though I seem to recall seeing an argument from Stuart Armstrong or someone of the sort to the effect that if your utilities aren’t bounded then your expected utility in some situations pretty much has to diverge anyway.
(We can’t hope for a much stronger reason, I think. In particular, your utilities can be just about anything, so there’s clearly no outright impossibility or inconsistency about having utilities that “increase too fast” relative to your probability assignments.)
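To make the 1/U point concrete, here is a toy calculation (the tail shapes and numbers below are invented for the sketch, not anything from the 2007 post): if the probability of being able to shift your utility by U falls at least as fast as 1/U, then the mugger’s expected payoff U * P(U) stays bounded no matter how large a U they name; if it falls more slowly, they can drive it as high as they like.

```python
# Toy illustration of the 1/U condition; the tail shapes are made up for the sketch.
def expected_payoff(U, tail_prob):
    """Expected utility of a mugger who promises U and is believed with probability tail_prob(U)."""
    return U * tail_prob(U)

for U in (10**6, 10**12, 10**18):
    fast_tail = expected_payoff(U, lambda u: 1.0 / u)     # falls as fast as 1/U: stays at ~1
    slow_tail = expected_payoff(U, lambda u: u ** -0.5)   # falls slower than 1/U: grows like sqrt(U)
    print(U, fast_tail, slow_tail)
# The mugger can only inflate the second column by naming a bigger U; the first is capped.
```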
What do you have in mind?
Crudely, something like this: Divide the value-laden world up into individuals and into short time-slices. Rate the happiness of each individual in each time-slice on a scale from −1 to +1. (So we suppose there are limits to momentary intensity of satisfaction or dissatisfaction, which seems reasonable to me.) Now, let h be a tiny positive number, and assign overall utility (1/h) * tanh(sum over all (individual, time-slice) cells of atanh(h * local utility)).
Because h is extremely small, for modest lifespans and numbers of agents this is very close to net utility = sum of local utility. But we never get |overall utility| > 1/h.
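In code, the aggregation might look something like this (the function name and the particular value of h are just illustrative choices of mine):

```python
import math

def overall_utility(local_utilities, h=1e-12):
    """(1/h) * tanh(sum of atanh(h * u)) over all (individual, time-slice) cells.

    Each local utility u lies in [-1, 1]. For tiny h each atanh(h*u) is ~ h*u,
    so for modest populations the result is close to the plain sum of the u's,
    but tanh saturates, so the absolute value can never exceed 1/h.
    """
    s = sum(math.atanh(h * u) for u in local_utilities)
    return math.tanh(s) / h

print(overall_utility([0.5, -0.2, 0.9]))   # roughly 1.2, i.e. close to the plain sum
```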
Now, this feels like an ad hoc trick rather than a principled description of how we “should” value things, and I am not seriously proposing it as what anyone’s utility function “should” (or does) look like. But I think one could make something of a case for an agent that works more like this: just add up local utilities linearly, but weight the utility for agent A in timeslice T in a way that decreases exponentially (with a small constant) with the description-length of (A,T), where the way in which we describe things is inevitably somewhat referenced to ourselves. So it’s pretty easy to say “me, now” and not that much harder to say “my wife, an hour from now”, so these two are weighted similarly; but you need a longer description to specify a person and time much further away, and if you have 3^^^3 people then you’re going to get weights about as small as 1/3^^^3.
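A very rough sketch of that weighting scheme, with every name and constant invented purely for illustration (in particular, description_length stands in for whatever agent-relative encoding the agent actually uses):

```python
import math

def weighted_total_utility(cells, description_length, c=0.01):
    """Sum local utilities, each weighted by exp(-c * description length of its cell).

    cells: dict mapping (agent, timeslice) -> local utility in [-1, 1].
    description_length: callable giving the agent's own, self-referenced
    description length (in bits, say) of an (agent, timeslice) pair.
    """
    return sum(math.exp(-c * description_length(agent, t)) * u
               for (agent, t), u in cells.items())

# "Me, now" and "my wife, an hour from now" have short descriptions, so weights near 1;
# singling out one of 3^^^3 far-away people takes on the order of log2(3^^^3) bits,
# so its weight ends up around 1/3^^^3.
```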
Let me see if I understand you correctly.
You have a matrix of (number of individuals) x (number of time-slices). Each matrix cell has a value (“happiness”) that’s constrained to lie in the [−1, 1] interval. You call the cell value “local utility”, right?
And then you, basically, sum up the cell values, re-scale the sum to fit into a pre-defined range and, in the process, add a transformation that makes sure the bounds are not sharp cut-offs, but rather limits which you approach asymptotically.
As to the second part, I have trouble visualising the language in which the description-length would work as you want. It seems to me it will have to involve a lot of scaffolding which might collapse under its own weight.
“You have a matrix …”: correct. “And then …”: whether that’s correct depends on what you mean by “in the process”, but it’s certainly not entirely unlike what I meant :-).
Your last paragraph is too metaphorical for me to work out whether I share your concerns. (My description was extremely handwavy so I’m in no position to complain.) I think the scaffolding required is basically just the agent’s knowledge. (To clarify a couple of points: not necessarily minimum description length, which of course is uncomputable, but something like “shortest description the agent can readily come up with”; and of course in practice what I describe is way too onerous computationally but some crude approximation might be manageable.)
The basic issue is whether the utility weights (“description lengths”) reflect the subjective preferences. If they do, it’s an entirely different kettle of fish. If they don’t, I don’t see why “my wife” should get much more weight than “the girl next to me on a bus”.
I think real people have preferences whose weights decay with distance—geographical, temporal and conceptual. I think it would be reasonable for artificial agents to do likewise. Whether the particular mode of decay I describe resembles real people’s, or would make an artificial agent tend to behave in ways we’d want, I don’t know. As I’ve already indicated, I’m not claiming to be doing more than sketch what some kinda-plausible bounded-utility agents might look like.
One easy way to do this is to map an unbounded utility function onto a finite interval. You will end up with the same order of preferences, but your choices won’t always be the same. In particular you will start avoiding cases of the mugging.
Not really avoiding: a bound on your utility in the context of a Pascal’s Mugging is basically a bound on what the Mugger can offer you. For any probability you assign to what the Mugger promises, there is some non-zero amount that you would be willing to pay, and that amount is a function of your bound (and of the probability, of course).
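To put some made-up numbers on that: with a bound B on utility, the most the Mugger’s promise can be worth in expectation is p * B, so you would still hand over any amount whose (bounded) disutility falls below that.

```python
B = 1.0                       # the bound on |utility| (illustrative)
p = 1e-9                      # your credence in the Mugger's promise (made up)
max_expected_gain = p * B     # no promise, however grand, can be worth more than this

def bounded_disutility_of_paying(dollars, scale=1e-8):
    # Made-up bounded disutility of parting with `dollars`.
    raw = dollars * scale
    return raw / (1 + raw)

for dollars in (5, 1, 0.01):
    print(dollars, bounded_disutility_of_paying(dollars) < max_expected_gain)
# Prints False, False, True: there is still some non-zero amount (here, a cent) you would pay.
```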
However, utility asymptotically approaching a bound is likely to have its own set of problems. Here is a scenario after five seconds of thinking:
That vexatious chap Omega approaches you (again!) and this time, instead of boxes, offers you two buttons; let’s say one of them is teal-coloured and the other is cyan-coloured. He says that if you press the teal button, 1,000,001 people will be cured of terminal cancer. But if you press the cyan button, 1,000,000 people will be cured of terminal cancer plus he’ll give you a dollar. You consult your utility function, happily press the cyan button and walk away richer by a dollar. Did something go wrong?
Yes, something went wrong in your analysis.
I suggested mapping an unbounded utility function onto a finite interval. This preserves the order of the preferences in the unbounded utility function.
In my “unbounded” function, I prefer saving 1,000,001 people to saving 1,000,000 and getting a dollar. So I have the same preference with the bounded function, and so I press the teal button.
If you want to do all operations (notably, adding utility and dollars) before mapping to the finite interval, you still fall prey to Pascal’s Mugging and I don’t see the point of the mapping at all in this case.
The mapping is of utility values, e.g.
In my unbounded function I might have:
Saving 1,000,000 lives = 10,000,000,000,000 utility.
Saving 1,000,001 lives = 10,000,010,000,000 utility.
Getting a dollar = 1 utility.
Saving 1,000,000 lives and getting a dollar = 10,000,000,000,001 utility.
Here we have getting a dollar < saving 1,000,000 lives < saving 1,000,000 lives and getting a dollar < saving 1,000,001 lives.
The mapping is a one-to-one function that maps values between negative and positive infinity to a finite interval, and preserves the order of the values. There are a lot of ways to do this, and it will mean that the utility of saving 1,000,001 lives will remain higher than the utility of saving 1,000,000 lives and getting a dollar.
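For example (this particular map is just one illustrative choice, not anything specified above), x / (1 + |x|) sends the whole real line onto (−1, 1) and preserves order; checking it against the numbers above needs exact rational arithmetic, since ordinary floats would round the two largest images to the same value.

```python
from fractions import Fraction

def squash(u):
    """One order-preserving map from (-inf, +inf) onto (-1, 1)."""
    u = Fraction(u)
    return u / (1 + abs(u))

dollar             = 1
save_1m            = 10_000_000_000_000      # saving 1,000,000 lives
save_1m_and_dollar = 10_000_000_000_001
save_1m_plus_one   = 10_000_010_000_000      # saving 1,000,001 lives

# Same ordering as before the mapping, so the teal button still wins:
assert squash(dollar) < squash(save_1m) < squash(save_1m_and_dollar) < squash(save_1m_plus_one)
```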
But it preserves this order, not everything else, and so it can still avoid Pascal’s Mugging. Basically the mugging depends on multiplying the utility by a probability. But since the utility has a numerical bound, when the probability gets low enough that product tends toward zero. This does mean that my system can give different results when betting is involved. But that’s what we wanted, anyway.
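And a sketch of that expected-value point, with the probability made up for illustration: once the mapped utility is capped at some bound B, the mugger’s offer is worth at most p * B in expectation, which a sure, non-negligible cost easily beats.

```python
from fractions import Fraction

B = 1                                   # bound on the mapped utility
p_mugger = Fraction(1, 10**30)          # made-up credence in the mugger's 3^^^3 promise
max_expected_gain = p_mugger * B        # at most 10^-30, however vast the promise
cost_of_paying = Fraction(5, 6)         # mapped disutility of handing over 5 raw utility,
                                        # i.e. 5 / (1 + 5) under the map sketched above
print(max_expected_gain < cost_of_paying)   # True: the bounded agent declines the mugger
```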
Oh, I see. So you do all the operations on the unbounded utility, but calculate the expected value of the bounded version.