Papers framing anthropic questions as decision problems?
A few weeks ago at a Seattle LW meetup, we were discussing the Sleeping Beauty problem and the Doomsday argument. We talked about how framing the Sleeping Beauty problem as a decision problem basically solves it, and then got the idea of using the same heuristic on the Doomsday problem. I think you would need to specify more about the Doomsday setup than is usually done to do this.
We didn’t spend a lot of time on it, but it got me thinking: Are there papers on trying to gain insight into the Doomsday problem and other anthropic reasoning problems by framing them as decision problems? I’m surprised I haven’t seen this approach talked about here before. The idea seems relatively simple, so perhaps there is some major problem that I’m not seeing.
Armstrong, Anthropic decision theory for self-locating beliefs.
Stuart’s anthropic decision theory uses utility-counting rather than probabilities, so it does not satisfy the axioms for a utility maximizer. So there’s a big difference between your typical decision problem, where you want to maximize utility, and the decisions of anthropic decision theory, which do not.
Is there an elaborated critique of this paper/idea somewhere?
I don’t think so. But I can give you the critique here :)
SUMMARY: If you don’t actually calculate expected utility, don’t expect to automatically make choices that correspond to a relevant utility function. Also, don’t name a specific function for a general feeling—you might accidentally start calling the function when you just mean the feeling, and then you get really wrong answers.
-
I’ve already made the obvious criticism—since Stuart’s anthropic decision theory is basically a way to avoid using subjective probabilities (also known as “probabilities”) when they’re confusing, it makes sense that it usually can’t do probability-associated things like maximizing the expectation of a specific utility function.
The replacement for actually figuring out the probabilities is an “anthropic preference” that takes a list of utilities of different people you could be inside a “world,” then outputs some effective utility for that world. That is, it’s some function A(U(1), U(2), U(3), … ), where U is the utility of being a certain person. The “worlds” are the different outcomes that you would know the probabilities for if no confusing anthropic things happened (for example, in a coin flip, the probabilities of the heads and tails worlds are 0.5). So the expected utility-ish-stuff is the sum over the worlds of P(world, if there was no anthropic stuff) * A(U(1), U(2), U(3), … ).
Expected utility, on the other hand, can be written as the sum over people you could be of P(being this person) * U(being this person). So the only time when this anthropic decision theory actually maximizes utility is when its expected utility-ish-stuff is proportional to the actual expected utility—that is, you have to pick the correct anthropic preference out of all possible functions, and it can be a different one for every different problem. The easiest way to do this is simply to know what the expected utility is beforehand, and in that case you might as well just maximize that :P
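To make the contrast concrete, here’s a minimal Python sketch of the two quantities as I’ve set them up (the function names and the toy numbers are mine, purely for illustration):

```python
# A quick sketch (mine, for illustration) of the two quantities described above.
# worlds: list of (P(world, if no anthropic stuff), [U of each person you could be there])
# p_person / utilities: the subjective probability of being each person, and their utility

def adt_value(worlds, A):
    """Sum over worlds of P(world, no anthropics) * A(utilities of the people in it)."""
    return sum(p * A(us) for p, us in worlds)

def expected_utility(p_person, utilities):
    """Sum over people you could be of P(being that person) * U(being that person)."""
    return sum(p * u for p, u in zip(p_person, utilities))

# Toy setup: heads world has one person at utility 1, tails world has two at 0.3 each.
worlds = [(0.5, [1.0]), (0.5, [0.3, 0.3])]
total_A = sum                                 # one candidate A: add the utilities up
average_A = lambda us: sum(us) / len(us)      # another candidate: average them

print(adt_value(worlds, total_A))     # 0.8
print(adt_value(worlds, average_A))   # 0.65
# Whether either matches expected utility depends on the anthropic probabilities,
# e.g. with "thirder" probabilities of 1/3 per person:
print(expected_utility([1/3, 1/3, 1/3], [1.0, 0.3, 0.3]))   # ≈ 0.533
```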
On the other hand, it’s not impossible to find a few useful “A”s, on the occasions when things are really simple. For example, if the different probabilities of being people are all a constant, multiplied by P(world, if there was no anthropic stuff), then the probability of being in a “world” is proportional to the number of people in that world, and the expected utility of being in a world is just the unweighted average of the utility. So the anthropic preference “A” just has to add everything together.
The Sleeping Beauty problem is an example that’s simple enough to meet these conditions. There are a few other simple situations that can also yield simple “A” functions. However, if you drift away from simplicity, for example by giving people weak evidence about which world they’re in, you can’t even always find an “A” that gives you back the expected utility.
Another way to get into trouble is if you use an “A” that you think corresponds to a specific expected utility, but you have never actually proved this. This was my real beef with Stuart’s paper—he names his anthropic preferences with colorful descriptors like “altruistic” or “selfish.” This is fine as long as you keep in mind that these names correspond to the anthropic preferences, not to any sort of utility function; but it leads to folly when he goes on to say things like “if we have an altruistic agent,” as if he could tell how the agent acted in general. Because the “A” that corresponds to a utility function can change from problem to problem, calling an “A” function “selfish” can’t somehow force it to correspond to a utility function we’d call selfish.
So in retrospect calling these anthropic preferences things like “altruistic” was a pretty bad idea, because it generated confusion. This confusion results in him whiffing the Presumptuous Philosopher problem, calling on what “selfish” versus “altruistic” methods of adding utilities together would say, with no guarantee that they have anything to do with any relevant expected utility. In fact, at no point does he ever compare anything to expected utility, because, if you remember back 20 paragraphs ago ( :D ), the whole point of this decision process was to avoid confusing probabilities.
Many thanks for your comments; it’s nice to have someone engaging with it.
That said, I have to disagree with you! You’re right, the whole point was to avoid using “anthropic probabilities” (though not “subjective probabilities” in general; I may have misused the word “objective” in that context). But the terms “altruistic”, “selfish” and so on, do correspond to actual utility functions.
“Selfless” means your utility function has no hedonistic content, just an arbitrary utility function over world states that doesn’t care about your own identity. “Altruistic” means your utility function is composed of some amalgam of the hedonistic utilities of you and others. And “selfish” means your utility function is equal to your own hedonistic utility.
The full picture is more complex—as always—and in some contexts it would be best to say “non-indexical” for selfless and “indexical” for selfish. But be that as it may, these are honest, actual utility functions that you are trying to maximise, not “A”s over utility functions. Some might be “A”s over hedonistic utility functions, but they still are genuine utilities: I am only altruistic if I actually want other people to be happy (or achieve their goals or similar); their happiness is a term in my utility function.
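To put that in slightly more concrete terms (a rough sketch of my own, not anything formal from the paper), assuming a hypothetical map from each person to their hedonic utility:

```python
# Rough gloss (mine) of the three utility types; `hedonic` maps each person to
# their hedonic utility, `valuation` is an arbitrary function over world states.

def selfless_U(world_state, valuation):
    # non-indexical: cares only about the world state, not about who "I" am
    return valuation(world_state)

def altruistic_U(hedonic):
    # some amalgam of everyone's hedonic utilities; a plain sum is one choice
    return sum(hedonic.values())

def selfish_U(me, hedonic):
    # indexical: equal to my own hedonic utility
    return hedonic[me]
```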
Then ADT can be described (for the non-selfish agents) colloquially as “before the universe was created, if I wanted to maximise U, what decision theory would I want any U-maximiser to follow? (i.e. what decision theory maximises the expected U in this context)”. So ADT is doubly a utility maximising theory: first pick the utility maximising decision theory, then the agents with it will try and maximise utility in accordance with the theory they have.
(for selfish/indexical agents it’s a bit more tricky and I have to use symmetry or “veil of ignorance” arguments; we can get back to that).
Furthermore, ADT can perfectly well deal with any other type of uncertainty—such as whether you are or aren’t in a Sleeping Beauty problem, or when you have partial evidence that it’s Monday, or whatever. There’s no need for it to restrict to the simple cases. Admittedly, for the presumptuous philosopher I restricted to a simple case, with simple binary altruistic/selfish utilities, but that was for illustrative purposes. Come up with a more complex problem, with more complicated utilities, and ADT will give a correspondingly more complex answer.
Okay, let’s look at the “selfish” anthropic preference laid out in your paper, in two different problems.
In both of these problems there are two worlds, “H” and “T,” which have equal “no anthropics” probabilities of 0.5. There are two people you could be in T and one person you could be in H. Standard Sleeping Beauty so far.
However, because I like comparing things to utility, I’m going to specify two sets of probabilities. In Problem 1, the probability of being each person is 1⁄3. In Problem 2, the probability of being the person in H is 1⁄2, while the probability of being a person in T is 1⁄4 each.
( These probabilities can be manipulated by giving people evidence of which world they are in—for example, you could spontaneously stop some H or T sessions of the experiment, so the people in the experiment can condition on the experiment not being stopped. The point is that both problems are entirely possible. )
And let’s have winning each bet give 1 hedonic utilon (you get to eat a candybar).
ADT makes identical choices in both problems, because it just interacts with the “if there were no anthropics” probabilities. The selfish preference just says to give each world the average utility of all its inhabitants. To calculate the expected-utility-ish thing for betting on Tails, we take the average bet won in the winning side (1 candybar) and multiply it by the probability of the winning side if there were no anthropics (0.5). So our “selfish” ADT agent pays up to 0.5 candybars for a bet on Tails in both problems.
Now, what are the expected hedonic utilities? (in units of candybars, of course)
In problem 1, the probability of being a winner is 2⁄3, so a utility maximizer pays up to 2⁄3 of a candybar to bet on Tails.
In problem 2, the probability of being a winner is 1⁄2, so a utility maximizer pays up to 1⁄2 of a candybar to bet on Tails.
So in problem 2, the “selfish” ADT agent and the utility maximizer do the same thing. This looks like a good example of selfishness. But what’s going on in problem 1? Even though the utilon is entirely candybar-based, the “selfish” anthropic preference seems to undervalue it. What anthropic preference would maximize expected hedonic utility?
Well, if you added up all the utilities in each world, rather than averaging, then an ADT agent would do the same thing as a utility maximizer in problem 1. But now in problem 2, this “total utility” ADT agent would overestimate the value of a candybar, relative to maximum expected utility.
There are in fact no ADT preferences that maximize candybars in both problems. There is no analogue of utility maximization, which makes sense because ADT doesn’t deal with the subjective probabilities and expected utility does.
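To lay out the arithmetic (my own check of the numbers above), where the break-even price is the most a given agent would pay for the Tails bet:

```python
from fractions import Fraction as F

# My own check of the numbers above. Betting c candybars on Tails at each awakening;
# a winning awakening pays 1 candybar. Heads world: the lone person nets -c.
# Tails world: each of the two people nets 1 - c.

def adt_bet_value(c, aggregate):
    return F(1, 2) * aggregate([-c]) + F(1, 2) * aggregate([1 - c, 1 - c])

def breakeven(aggregate):
    # adt_bet_value is linear in c, so interpolate between c = 0 and c = 1
    v0, v1 = adt_bet_value(F(0), aggregate), adt_bet_value(F(1), aggregate)
    return v0 / (v0 - v1)

average = lambda us: sum(us) / len(us)   # the "selfish" preference as I've read it
total = sum                              # add up all utilities in the world

print(breakeven(average))   # 1/2 -- the same in both problems
print(breakeven(total))     # 2/3 -- also the same in both problems

# Expected hedonic utility of the bet is P(being a winner) - c, so the break-even
# price is just P(being a winner): 2/3 in Problem 1, 1/2 in Problem 2.
```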
To translate this into ADT terms: in problem 2 the coin is fair; in problem 1 the coin is (1/3, 2/3) on (H, T) (or maybe the coin was fair, but we got extra info that pushed the posterior odds to (1/3, 2/3)).
Then ADT (and SSA) says that selfish agents should bet up to 2⁄3 of a candybar on Tails in problem 1, and 1⁄2 in problem 2. Exactly the same as what you were saying. I don’t understand why you think that ADT would make identical choices in both problems.
The reason that’s “exactly as I was saying” is because you adjusted a free parameter to fit the problem, after you learned the subjective probabilities. The free parameter was which world to regard as “normal” and which one to apply a correction to. If you already know that the (1/2, 1⁄4, 1⁄4) problem is the “normal” one, then you already solved the probability problem and should just maximize expected utility.
Er no—you gave me an underspecified problem. You told me the agents were selfish (good), but then just gave me anthropic probabilities, without giving me the non-anthropic probabilities. I assumed you were meaning to use SSA, and worked back from there. This may have been incorrect—were you assuming SIA? In that case the coin odds are (1/2,1/2) and (2/3,1/3), and ADT would reach different conclusions. But only because the problem was underspecified (giving anthropic probabilities without explaining the theory that goes with them is not specifying the problem).
As long as you give a full specification of the problem, ADT doesn’t have an issue. You don’t need to adjust free parameters or anything.
I feel like I’m missing something here. Can you explain the hole in ADT you seem to find so glaring?
I intended “In both of these problems there are two worlds, “H” and “T,” which have equal “no anthropics” probabilities of 0.5. ”
In retrospect, my example of evidence (stopping some of the experiments) wasn’t actually what I wanted, since an outside observer would notice it. In order to mess with anthropic probabilities in isolation you’d need to change the structure of coinflips and people-creation.
But you can’t mess with the probabilities in isolation. Suppose I were an SIA agent, for instance; then you can’t change my anthropic probabilities without changing non-anthropic facts about the world.
I’m uncertain whether what you’re saying is relevant. The question at hand is: is there some change to a problem that changes anthropic probabilities, but is guaranteed not to change ADT decisions? Such a change would have to conserve the number of worlds, the number of people in each world, the possible utilities, and the “no anthropics” probabilities.
For example, if my anthropic knowledge says that I’m an agent at a specific point in time, a change in how long Sleeping Beauty stays awake in different “worlds” will change how likely I am to find myself there overall.
Is there? It would require some sort of evidence that would change your own anthropic probabilities, but that would not change the opinion of any outside observer if they saw it.
Doesn’t feel like that would work… if you remember how long you’ve been awake, that makes you into slightly different agents, and if the duration of the awakening gives you any extra info, it would show up in ADT too. And if you forget how long you’ve been awake, that’s just Sleeping Beauty with more awakenings...
Define “individual impact” as the belief that your own actions have no correlations with those of your copies (the belief your decisions control all your copies is “total impact”). Then ADT basically has the following equivalences:
ADT + selfless or total utilitarian = SIA + individual impact (= SSA+ total impact)
ADT + average utilitarian = SSA + individual impact
ADT + selfish = SSA + individual impact + complications (e.g. with precommitments)
If those equivalences are true, it seems that we cannot vary the anthropic probabilities without varying the ADT decision.
EDIT: Expanded first point a bit.
Hm. One could try and fix it by splitting each point in time into different “worlds,” like you suggest below. But the updating from time (let’s assume there’s no clock to look at, so the curves are smooth) would rely on the subjective probabilities, which you are avoiding. The update ratio is P(feels like 4 hours | heads) / P(feels like 4 hours). If P(feels like 4 hours | X) is 0.9 if X is heads and 0.8 if X is tails, then if the probabilities are 1⁄3 the ratio will be 1.08, while if the probabilities are 1⁄2, 1⁄4, 1⁄4 the update is a factor of 1.059.
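For reference, here’s the arithmetic behind those two numbers:

```python
# The arithmetic behind the 1.08 and 1.059 above:
# P(feels like 4 | heads) = 0.9, P(feels like 4 | tails) = 0.8.

def update_ratio(p_heads, p_each_tails_person):
    p_feels_4 = p_heads * 0.9 + 2 * p_each_tails_person * 0.8
    return 0.9 / p_feels_4

print(update_ratio(1/3, 1/3))   # ≈ 1.08   (probabilities 1/3, 1/3, 1/3)
print(update_ratio(1/2, 1/4))   # ≈ 1.059  (probabilities 1/2, 1/4, 1/4)
```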
This does lead to a case a bit more complicated than my original examples, though, because the people in different worlds will make different decisions. I’m not even sure how ADT would handle this situation, since it has to avoid the subjective probabilities—do you respond like an outside observer, and use 0.5, 0.5 for everything?
Yes, that would be reasonable.
Those only hold if things are simple. To say “these might prevent things from getting any more complicated” is to put the cart before the horse.
ADT does not avoid subjective probabilities—it only avoids anthropic probabilities. P(feels like 4 hours | heads) is perfectly fine. ADT only avoids probabilities that would change if you shifted from SIA to SSA or vice versa.
It is exactly one of those probabilities.
Can you spell out the full setup?
Okay, so let’s say you’re given some weak evidence about which world you’re in—for example, you’re asked the question when you’ve been awake for 4 hours if the coin was Tails vs. awake for 3.5 hours if Heads. In the Doomsday problem, this would be like learning facts about the earth that would be different if we were about to go extinct vs. if we weren’t (we know lots of these, in fact).
So let’s say that your internal chronometer is telling you that it “feels like it’s been 4 hours” when you’re asked the question, but you’re not totally sure—let’s say that the only two options are “feels like it’s been 4 hours” and “feels like it’s been 3.5 hours,” and that your internal chronometer is correctly influenced by the world 75% of the time. So P(feels like 4 | heads) = 0.25, P(feels like 3.5 | heads) = 0.75, and vice versa for tails.
A utility-maximizing agent would then make decisions based on P(heads | feels like 4 hours) - but an ADT agent has to do something else. In order to update on the evidence, an ADT agent can just weight the different worlds by the update ratio. For example, if told that the coin is more likely to land heads than tails, an ADT agent successfully updates in favor of heads.
However, what if the update ratio also depended on the anthropic probabilities (that is, SIA vs. SSA)? That would be bad—we couldn’t do the same updating trick. If our new probability is P(A|B), Bayes’ rule says that’s P(A)*P(B|A)/P(B), so the update ratio is P(B|A)/P(B). The numerator is easy—it’s just 0.75 or 0.25. Does the denominator, on the other hand, depend on the anthropic probabilities?
If we look at the odds ratios, then P(A|B)/P(¬A|B)=P(A)/P(¬A) * P(B|A)/P(B|¬A). So as long as we have P(B|A) and P(B|¬A), it seems to work exactly as usual.
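For instance, a quick sketch with the 75%-reliable chronometer from above (the prior odds are whatever your anthropic theory hands you; the numbers below are just illustrative):

```python
# Sketch of the odds-ratio update with the 75%-reliable chronometer from above.
# Likelihood ratio for "feels like 4 hours": P(f4 | tails) / P(f4 | heads) = 0.75 / 0.25 = 3.

def posterior_p_tails(prior_p_tails):
    prior_odds = prior_p_tails / (1 - prior_p_tails)   # tails : heads
    posterior_odds = prior_odds * 3
    return posterior_odds / (1 + posterior_odds)       # normalise so the probabilities sum to one

print(posterior_p_tails(1/2))   # 0.75    with even prior odds
print(posterior_p_tails(2/3))   # ≈ 0.857 with 2:1 prior odds on tails
```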
Good idea. Though since it’s a ratio, you do miss out on a scale factor—in my example, you don’t know whether to scale the heads world by 1⁄3 or the tails world by 3. Or mess with both by factors of 3⁄7 and 9⁄7, who knows?
Scaling by the ratio does successfully help you correct if you want to compare options between two worlds—for example, if you know you would pay 1 in the tails world, you now know you would pay 1⁄3 in the heads world. But if you don’t know something along those lines, that missing scale factor seems like it would become an actual problem.
The scale ratio doesn’t matter—you can recover the probabilities from the odds ratios (and the fact that they must sum to one).
I think you’re confusing the odds ratio (P(A)/P(¬A) * P(B|A)/P(B|¬A)), which ADT can’t touch, with the update on the odds ratio (P(B|A)/P(B|¬A)), which has to be used with a bit more creativity.
Thanks!
Thanks luke!
Yes. My paper goes into the Doomsday problem, and what I basically show is that it’s very hard to phrase it in terms of anthropic decision theory. To over-simplify, if you’re selfish, there may be a doomsday problem, but you wouldn’t care about it; if you’re selfless, you’d care about it, but you wouldn’t have one. You can design exotic utilities that allow the doomsday argument to go through, but they are quite exotic.
Read the paper to get an idea what I mean, then feel free to ask me any questions you want.
Stuart, thanks for contributing here. I’ve posted a couple of discussion threads myself on anthropic reasoning and the Doomsday Argument, and your paper has been mentioned in them, which caused me to read it.
I’m interested how you would like to apply ADT in Big World cases, where e.g. there are infinitely many civilizations of observers. Some of them expand off their home planets, others don’t, and we are trying to estimate the fraction (limiting frequency) of civilizations that expand, when conditioning on the indexical evidence that we’re now living on our home planet. It seems to me that aggregate utility functions (either “selfless” or additive utility altruist) won’t give sensible results here—you just end up comparing infinity against infinity and can’t make decisions. Whereas “selfish” or average utility altruist functions will give Doomish conclusions, for the reasons discussed in your paper.
See http://lesswrong.com/lw/9ma/selfindication_assumption_still_doomed/ and in particular my recent questions to endoself here http://lesswrong.com/lw/9ma/selfindication_assumption_still_doomed/6gaq
I’m genuinely interested in how different Decision Theories handle the updating in the infinite case (or if there is no formal updating, how the agents bet at different stages).
A final thing is that I’m puzzled by your claim that selfish and average utility altruist agents won’t care about Doom, and so it won’t affect their decisions. Won’t average utilitarians worry about the negative utility (pain, distress) of agents who are going to face Doom, and consider actions which will mitigate that pain? Won’t selfish agents worry about facing Doom themselves, and engage in survivalist “prepping” (or if that’s going to be no use at all, party like there’s no tomorrow)?
I don’t know how to deal with infinite ethics, and I haven’t looked into that in detail. ADT was not designed with that in mind, and I think we must find specific ways of extending these types of theories to infinite situations. And once they are found, we can apply them to ADT (or to other theories).
Though on a personal note, I have to say that non-standard reals are cool.
I’m partial to the surreals myself. Every ordered field is a subfield of the surreals, though this is slightly cheating since the elements of a field form a set by definition but there are also Fields, which have elements forming a proper class. The surreals themselves are usually a Field, depending on your preference of uselessly abstract set theory axioms. We know that we want utilities to form an ordered field (or maybe a Field?), but Dedekind completeness for utilities seems to violate our intuitions about infinite ethics.
I haven’t studied the hyperreals, though. Is there any reason that you think they might be useful (the transfer principle?) or do you just find them cool as a mathematical structure?
They allow us to extend real-valued utilities, getting tractable “infinities” in at least some cases.
I think that hyper-reals (or non-standard reals) would model a universe of finite, but non-standard, size. In some cases, this would be deemed a reasonable model for an infinite universe. But in this case, I don’t think they help. The difficulty is still with SIA (or decision theories/utility functions with the same effect as SIA).
SIA will always shift weight (betting weight, decision weight) towards the biggest universe models available in a range of hypotheses. So if we have a range containing universe models with both standard finite and non-standard finite (hyperreal) sizes, SIA will always cause agents to bet on the non-standard ones, and ignore the standard ones. And it will further cause betting on the “biggest” non-standard sizes allowed in the range. If our models have different orders of non-standard size (R, R^2, R^3, etc., where R is bigger than every standard real), then it will shift weight up to the highest-order models allowed. If there are no highest orders in the range, then we get an improper probability distribution which can’t be normalised, not even using a non-standard normalisation constant. Finally, if our range contains any models of truly infinite size (infinite cardinality of galaxies, say), then since these are bigger than all the non-standard finite sizes, the betting weight shifts entirely to those. So the non-standard analysis may not help much.
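A toy numerical illustration of that shifting (the priors and sizes are made up): SIA reweights each model’s prior by its population, so almost all the weight lands on the biggest model in the range.

```python
# Toy illustration (made-up priors and sizes): SIA weight is proportional to
# prior * population, so the largest model in the range swallows the weight.

def sia_weights(models):
    raw = {name: prior * pop for name, (prior, pop) in models.items()}
    z = sum(raw.values())
    return {name: w / z for name, w in raw.items()}

models = {
    "1e9 galaxies": (0.5, 1e9),
    "1e12 galaxies": (0.4, 1e12),
    "1e15 galaxies": (0.1, 1e15),
}
print(sia_weights(models))   # ~0.000005, ~0.004, ~0.996
# With no largest model in the range, the weights cannot be normalised at all.
```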
Generally, this is my biggest bug-bear with SIA; it forces us to deal explicitly with the infinite case, but then ducks out when we get to that case (can’t compare infinities in a meaningful way; sorry).
Stuart, thanks for your comments on the infinite case by the way. I agree it is not obvious how to treat it, and one strategy is to avoid it completely. We could model infinite worlds by suitable big finite worlds (say with 3^^^3 galaxies) and assign zero prior probability to anything bigger. SIA will now shove all the weight up to those big finite worlds, but at least all utilities and populations are still finite and all probability distributions normalise.
This is roughly how it would go: we then try to take a limit construction by making the 3^^^3 a variable N and looking at the limiting decision as N goes towards infinity. If we’re still making the same decisions asymptotically, then we declare those the correct decisions in the strictly infinite case. Sounds promising.
There are a couple of problems though. One is that this version of SIA will favour models where most star systems develop civilizations of observers, and then most of those civilizations go Doom. The reason is that such models maximize the number of observers who observe a universe like what we are observing right now, and hence become favoured by SIA. We still get a Doomsday argument.
A second problem is that this version will favour universe models which are even weirder, and packed very densely with observers (inside computer simulations, where the computers fill the whole of space and time). In those dense models, there are many more agents with experiences like ours (they are part of simulations of sparsely-populated universes) than there are agents in truly sparse universes. So SIA now implies a form of the simulation argument, and an external universe outside the simulation which looks very different from the one inside.
And now, perhaps even worse, among these dense worlds, some will use their simulation resources simulating mostly long-lived civilizations, while others will use the same resources simulating mostly short-lived civilizations (every time a simulation Dooms, they start another one). So dense worlds which simulate short-lived civilizations spend more of their resources simulating people like us, and generally contain more agents with experiences like ours, than dense worlds which simulate long-lived civilizations. So we STILL get a Doomsday Argument, on top of the simulation argument.
Hyper-reals ugly. No good model. Use Levi-Civita field instead. Hahn series also ok.
(me know math, but me no know grammar)
Surreals also allow this and they are more general, as the hyperreals and the Levi-Civita field are subfields of the surreals.
OK, the “surreals” contain the transfinite ordinals, hence they contain the infinite cardinals as well. So, surreals can indeed model universes of strictly infinite size i.e. not just non-standard finite size.
I think the SIA problem of weighting towards the “largest possible” models still applies though. Suppose we have identified two models of an infinite universe; one says there are aleph0 galaxies; the other says there are aleph1 galaxies. Under SIA, the aleph1 model gets all the probability weight (or decision weight).
If we have a range of models with infinities of different cardinalities, and no largest cardinal (as in Zermelo Fraenkel set theory) then the SIA probability function becomes wild, and in a certain sense vanishes completely. (Given any cardinal X, models of size X or smaller have zero probability.)
Yes, this doesn’t solve the problem of divergence of expected utility; it just lets us say that our infinite expected utilities fail to converge, rather than only being able to say that arbitrarily large real utilities fail to converge.
I was under the impression that you could represent all hyperreals as taking limits (though not the other way around), is that wrong? They could still be useful if they simplify the analysis a good deal, though.
I was simplifying when I said “didn’t care”. And if there’s negative utility around, things are different (I was envisaging the doomsday scenario as something along the lines of painless universal sterility). But let’s go with your model, and say that doomsday will be something painful (slow civilization collapse, say). How will average and total altruists act?
Well, an average altruist would not accept an increase in the risk of doom in exchange for other gains. The doom is very bad, and would mean a small population, so the average badness is large. Gains in the case where doom doesn’t happen would be averaged over a very large population, so would be less significant. The average altruist is willing to sacrifice a lot to avoid doom (but note this argument needs doom=small population AND bad stuff).
What about the total altruist? Well, they still don’t like the doom. But for them the benefits in the “no doom” scenario are not diluted. They would be willing to accept a slight increase in the risk of doom, in exchange for some benefit to a lot of people in the no-doom situation. They would turn on the reactor that could provide limitless free energy to the whole future of the human species, even if there was a small risk of catastrophic meltdown.
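A toy numerical version of that trade-off (all the numbers are made up): the no-doom gain is a fixed lump of utility, so it barely moves the average but strongly moves the total.

```python
# Made-up numbers for the reactor trade-off above. No-doom world: 1e10 people at
# baseline utility 1. Doom world: 1e3 survivors at -10. The reactor adds a lump
# of 5e8 utility to the no-doom world but raises P(doom) from 1% to 2%.

POP_NODOOM, POP_DOOM, BONUS = 1e10, 1e3, 5e8

def expected_average(p_doom, reactor):
    per_person_bonus = BONUS / POP_NODOOM if reactor else 0.0
    return p_doom * (-10) + (1 - p_doom) * (1 + per_person_bonus)

def expected_total(p_doom, reactor):
    bonus = BONUS if reactor else 0.0
    return p_doom * (POP_DOOM * -10) + (1 - p_doom) * (POP_NODOOM * 1 + bonus)

print(expected_average(0.02, True) - expected_average(0.01, False))  # ≈ -0.06: average altruist refuses
print(expected_total(0.02, True) - expected_total(0.01, False))      # ≈ +3.9e8: total altruist turns the reactor on
```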
So the fact these two would reason differently is not unexpected. But what I’m trying to get to is that there is no simple single “doomsday argument” for ADT. There are many different scenarios (where you need to specify the situation, the probabilities, the altruisms of the agents, and the decisions they are facing), and in some of them, something that resembles the classical doomsday argument pops up, and in others it doesn’t.