Of all the SIA-doomsdays in all the worlds...
Ideas developed with Paul Almond, who kept on flogging a dead horse until it started showing signs of life again.
Doomsday, SSA and SIA
Imagine there’s a giant box filled with people, and clearly labelled (inside and out) “(year of some people’s lord) 2013”. There’s another giant box somewhere else in space-time, labelled “2014”. You happen to be currently in the 2013 box.
Then the self-sampling assumption (SSA) produces the doomsday argument. It works approximately like this: SSA has a preference for universes with smaller numbers of observers (since it’s more likely that you’re one-in-a-hundred than one-in-a-billion). Therefore we expect that the number of observers in 2014 is smaller than we would otherwise “objectively” believe: the likelihood of doomsday is higher than we thought.
What about the self-indication assumption (SIA) - that makes the doomsday argument go away, right? Not at all! SIA has no effect on the number of observers expected in 2014, but increases the expected number of observers in 2013. Thus we still expect the number of observers in 2014 to be lower than we otherwise thought. There’s an SIA doomsday too!
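To make this concrete, here is a minimal numerical sketch of the two-box setup. The box sizes, the uniform prior, and the independence between the boxes are all illustrative choices of mine, not part of the argument itself:

```python
# A minimal toy model of the "two independent boxes" setup above.
# The specific box sizes and the uniform prior are illustrative assumptions.

import itertools

sizes = [100, 1000]          # possible numbers of observers in each box
prior = {(n13, n14): 0.25    # independent, uniform prior over the four combinations
         for n13, n14 in itertools.product(sizes, sizes)}

def expectations(weights):
    """Posterior expectations of the 2013 and 2014 box sizes."""
    z = sum(weights.values())
    e13 = sum(n13 * w for (n13, n14), w in weights.items()) / z
    e14 = sum(n14 * w for (n13, n14), w in weights.items()) / z
    return e13, e14

# No anthropic update: just the prior.
plain = prior

# SSA: given that I find myself in the 2013 box, the likelihood of that
# observation is my reference class's share in 2013: n13 / (n13 + n14).
ssa = {h: prior[h] * h[0] / (h[0] + h[1]) for h in prior}

# SIA: weight each world by the number of observers in my situation (n13).
sia = {h: prior[h] * h[0] for h in prior}

for name, w in [("prior", plain), ("SSA", ssa), ("SIA", sia)]:
    e13, e14 = expectations(w)
    print(f"{name:6s}  E[2013] = {e13:7.1f}   E[2014] = {e14:7.1f}")

# SSA pulls E[2014] down directly; SIA leaves E[2014] alone but inflates
# E[2013], so in both cases 2014 ends up looking small relative to 2013.
```

With these toy numbers, the prior expectation for each box is 550; SSA drops the 2014 expectation to roughly 366, while SIA leaves it at 550 but raises the 2013 expectation to roughly 918.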
Enter causality
What’s going on? SIA was supposed to defeat the doomsday argument! What happens is that I’ve implicitly cheated—by naming the boxes “2013” and “2014”, I’ve heavily implied that these “boxes” figuratively correspond to two subsequent years. But then I’ve treated them as independent for SIA, like two literal distinct boxes.
In reality, of course, the contents of two years are not independent. Causality connects the two: it’s much more likely that there are many observers in 2014 if there were many in 2013. Indeed, most of the observers will exist in both years (and there will be some subtle SSA issues of “observer moments” that we’re eliding here—does a future you count as the same observer or a different one?). So causality removes the independence assumption, and though there may be some interesting SIA effects on changes in growth rates, we won’t see a real SIA doomsday.
Exit causality
But is causality itself a justified assumption? To some extent it is: some people who think they live in 2013/2014 will certainly be in a causal relationship with people who think they live in 2014/2013.
But many will not! What of Boltzmann brains? What of Boltzmann worlds—brief worlds that last less than a year?
What about deluded worlds: worlds where the background data coincidentally (or conspiratorially) imply that we exist on the planet Earth around the Sun, in 2013 - but where we really exist inside a star circling a space-station a few seconds after the seventh toroidal big bang, or something, and will soon wake up to this fact. What about simply deluded people—may we be alien nutjobs, dreaming we’re humans? And of great relevance to arguments often presented here, what if we are short-term simulations run by some advanced species?
All these are possible, with non-zero probability (the probability of simulations may be very high indeed, under some assumptions). All of these break the causal link to 2014 or other future events. And hence all of them allow a genuine SIA doomsday argument to flourish: we should expect that seeing 2014 is less likely than is objectively implied, given that we think we are in the year 2013.
Is anthropic decision theory (ADT) subject to the same doomsday argument? In that form, no. In any situation where causality breaks down, your decisions cease to have consequences, and ADT tosses them aside. But more complicated preferences or setups could bring doomsday into ADT as well.
I’m having trouble understanding the setup and how the conclusions follow from the two assumptions in the first section.
As I understand it, I know that I’m in the 2013 box—do I also know the number of others in the 2013 box? I had thought so, but then I’m not sure why I’m reasoning about the “expected number of observers in 2013” in the SIA paragraph. Shouldn’t my knowledge of the true number screen off my expectations?
I have heard this around FHI before; Nick seems unbothered by such utterances. Nevertheless, I think this is false to some extent. This much is true: (S)SSA ⊨ (A ⊢ DA), A being some still-open claims about the reference class. SSA produces a much weaker version of DA, which might not deserve the name, and using it can be misleading. On the other hand, claiming DA was refuted by (S)SSA is also misleading. Between these two misleading options, Nick seems to have chosen the former. His choice made sense from a utilitarian-memetic perspective, 10 years ago, among philosophers. However, I would prefer people already concerned with x-risks to use another terminology, one which avoided both.
I didn’t understand this. Why would the reference class be the universe you belong to? The reference class ought to be composed of all epistemically indistinguishable individuals (let’s call these EII), modulo all beliefs with no relevance to the questions being assessed. I think we discussed my views on this last time I was here (I’m Joao, btw).

I believe there are some grounds for claiming that (S)SSA* entails you are in the universe with the biggest number of copies of epistemically indistinguishable individuals like you (let’s call these the class Z universes). On the other hand, if there are many more non-Z universes containing EII like you, enough to counterbalance the fact that there are many more EII like you in the Z universes, then it can be the case that it is much more likely you are not in a Z universe.

If we are talking about possible worlds in logical space, instead of possible universes, then the Z class seems to have only one member. This is so because, if a world has the maximum number of EII like you, then it is full: it has exhausted the logically possible permutations, and there are no remaining degrees of freedom where one could put more of anything else. This seems to be a good argument that (S)SSA does not entail an absurdity, for then it becomes clear that the class Z (or the descending classes Y, X, ... ranked by the number of EII like you) is much smaller than the classes of worlds where the frequency of EII like you is not oddly huge. If we are talking about physically possible universes, then I’m not sure how to prevent the absurdity.

But it still seems much more likely that you are in a universe with a lot of members of your reference class (or EII like you) than in one with few of them, provided the reference class cuts across universes—as I think it should, in most scenarios. In the end, physical universes or light-cones are much less relevant than the reference class. Rational observers live in the latter, atoms in the former. Whereas we live in an evolutionary mess.
*(S)SSA should probably mean SSSA, the version where I’m sure my arguments make sense. I think they also work for SSA, but I’m not sure.
Didn’t Katja Grace have a post on this 2 years ago? Sadly the archive function of her blog isn’t great, but I’ll try to dig it up.
EDIT: Aha, found it. Two points to Katja!
Thanks for that link! Katja’s setup is somewhat similar, but not identical. The emphasis on causality is mainly what distinguishes the two—a causal prior undoes Katja’s doomsday.
Isn’t Katja’s argument that there is an upper bound on the size of the future box? Assume that the future box is labelled “Colonize Universe” rather than “2014”; then her great filter argument is that very few planets can ever get to the colonize box. So when SIA increases the estimated size of the 2013 box, it also reduces the estimated probability of getting from 2013 to colonize. How does the causal prior undermine this argument?
Her great filter argument does that, but it’s not a direct SIA argument—we use extra observations (the fact we don’t see other aliens) and some assumptions about similarity across different species, to get that result.
OK, so your point is that while SIA itself favours hypotheses which make the 2013 box bigger, THOSE favoured hypotheses (very large multiverse, lots of habitable planets) also tend to make the 2014 box bigger. So we don’t change the expected ratio Size(2014)/Size(2013).
It is only if we have additional evidence constraining Size(2014) - or Size(Colonize Universe) - that we get a ratio shift.
Causality means that the 2013 box being bigger makes the 2014 box bigger. If they are independent, there are many priors where making the 2013 box bigger doesn’t make the 2014 box bigger.
Katja’s argument is that SIA makes the “human-like civilization” box bigger. But observations say that the “colonise the universe” box isn’t big. Hence we can conclude that the causality breaks down between the two boxes—that human-like civilizations don’t tend to colonise the universe.
The argument is a bit more dependent on your choice of priors (see http://lesswrong.com/lw/1zj/sia_wont_doom_you/) but that’s the gist of it.
Maybe I’m dumb, so explain this to me like I’m 5: how is anthropic reasoning of this sort not generalizing from an example of 1? I happened to be born into this particular era. That says nothing about the prior probability of being born into this era, just that I was. Even if there is a P(10^-50) chance of living on Earth in the 21st century (as opposed to all other times and places where humans lived and will live), that probability still represents 7 billion someones.
It’s like a stranger appearing out of nowhere, saying ‘Congrats! you won the lottery!’ and handing you a $100 bill, then disappearing, and you trying to work out what the odds must have been for the lottery. You can’t; you have no way of knowing. All you know is that you won—but it is completely unwarranted to make any assumptions about the odds from that.
That’s exactly what it is.
Wrong. A sample of 1 (the fact that you are in this era) can give you a surprisingly large amount of information about the sample space (people born, sorted by era). The chance of your sample of 1 being in the first 10^-50 of the sample space is pretty small. See also: the German tank problem.
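For concreteness, here is a rough single-observation version of that tank problem. The cap on N, the uniform prior, and the “serial numbers 1..N, drawn uniformly” model are assumptions of mine, and they are doing most of the work:

```python
# Single-observation German tank sketch: see one serial number and update a
# prior over the total number of tanks N. The cap of 1000 and the uniform
# prior are arbitrary assumptions for illustration.

observed = 12
N_max = 1000

# Prior: uniform over N = 1..N_max. Likelihood of seeing serial `observed`
# from tanks numbered 1..N is 1/N if N >= observed, else 0.
posterior = {N: (1.0 / N if N >= observed else 0.0) for N in range(1, N_max + 1)}
total = sum(posterior.values())
posterior = {N: p / total for N, p in posterior.items()}

# Posterior median: the single data point concentrates belief on small N
# relative to the prior.
cumulative, median = 0.0, None
for N in sorted(posterior):
    cumulative += posterior[N]
    if cumulative >= 0.5:
        median = N
        break

print("posterior median of N:", median)
print("P(N <= 100):", round(sum(p for N, p in posterior.items() if N <= 100), 3))
```

Even one serial number shifts the posterior median from the prior’s roughly 500 down to roughly 100; that is the sense in which a sample of 1 is informative.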
Does not apply. The Allies witnessed more than one tank, from which they were able to infer a sequential numbering scheme, and thus derive all sorts of information about German manufacturing capability and total armaments.
What if the Allies’ statisticians just had one representative tank sample, with the number “12” stamped on it? What could they infer then? Nothing. Maybe it’s a sequential serial number and they should infer that Germany only has ~24 tanks. Or maybe they should take as a prior that Hitler would send his older tanks into battle first, in which case you’d expect an early serial number and who knows how many tanks there are. Maybe it’s a production number and there are 12 different models of this tank class and unknown production runs of each. Maybe “12” just means it was made in December, or from factory #12.
Ok, some of these examples are specific to the tank problem and don’t generalize to anthropic reasoning, but the point still stands: it is bad epistemology to generalize from one example. The only definitively valid anthropic conclusion is that there is at least one representative sample, not zero (i.e., that we live in a universe and at a specific time where human beings do exist). For the doomsday hypothesis, that tells us nothing.
(Another form of the anthropic principle applies to the Fermi paradox, but in this case we can infer other sample points from not seeing evidence of extraterrestrial intelligence in our local neighborhood. This is a very different line of reasoning, and does not apply here as far as I can tell.)
EDIT: To be clear, the specific point where the German tank problem analogy falls apart, is in its underlying assumption: that the tank (us) was selected at random from a pool of armaments (possible birthdays) with a uniform distribution. This is a completely unwarranted assumption. If, on the other hand, reincarnation were real and timeless, then you could look at “past” (future?) lives and start doing statistical analysis. But we don’t have that luxury, alas.
One data point cannot tell you literally nothing. If it did, then by induction, any finite number of data points would also tell you literally nothing. In most cases, a single data point tells you very little, because it is somewhat rare for people to be considering different hypotheses in which a single observed data point is orders of magnitude more likely under one hypothesis than under the other. The doomsday argument is an exception: it is 10^10 times as likely for a randomly selected human to be one of the first tenth of humans who will ever have lived as it is for a randomly selected human to be one of the first 10^-11 of humans who will ever have lived (a toy version of this update is sketched below).
How is it warranted to assume that you are a priori much more likely to be in extremely unusual circumstances than a randomly selected human is?
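To put a number on the 10^10 likelihood ratio above, here is a toy two-hypothesis update. The birth rank and the million-to-one prior for the long future are made-up figures, purely for illustration:

```python
# Toy two-hypothesis version of the doomsday update described above.
# The prior odds of 10^6 : 1 in favour of the long future are invented.

r = 10**11                 # assumed birth rank (roughly, humans born so far)

N_doom = 10 * r            # "doom soon": you are in the first tenth
N_long = 10**11 * r        # "long future": you are in the first 10^-11

# SSA-style likelihood of having birth rank r, given N humans ever: 1/N.
likelihood_ratio = (1 / N_doom) / (1 / N_long)   # = 10^10 in favour of doom

prior_odds_long_vs_doom = 10**6                  # generous prior for the long future
posterior_odds_doom_vs_long = likelihood_ratio / prior_odds_long_vs_doom

print("likelihood ratio (doom : long):", f"{likelihood_ratio:.0e}")
print("posterior odds (doom : long):", f"{posterior_odds_doom_vs_long:.0e}")
# Even a million-to-one prior for the long future is swamped by the 10^10
# likelihood ratio, which is exactly the force of the doomsday argument.
```

Whether this update is legitimate is what the rest of this thread is arguing about; the sketch only shows that, if you accept the sampling assumption, one data point carries an enormous likelihood ratio.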
What I said was:
An inference requires more than one data point. Let me give you a number: 7. Care to tell me what pattern this came from?
Your extra data points are summarized by your prior: you assume that your existence was randomly selected from the range of all possible human existences over all time, and then use that prior to reason about the doomsday paradox. I am saying that this prior has absolutely no rational basis (unless you are a theist and believe in [re-]incarnation).
You keep repeating this, but your only defense of it consists of examples of situations in which a single data point does not give you much information. This does not show that a single data point can never give you a significant amount of information. I have already explained how in the doomsday argument, a single data point does give you a lot of information, but your response was simply to repeat your claim and provide another example in which a single data point gives you less information.
Sure. 7 is a very common number for people to pick when they try to come up with an arbitrary number, so this is significant (though not overwhelming) evidence that you made up an arbitrary number, as opposed to, for instance, using a random number generator, or counting something in particular, etc. There’s been plenty of research into human random number generation, so this tells me quite a bit about what other numbers tend to be generated by the same process that you used to generate the number 7. For instance, 3 is also quite common, and even numbers and multiples of 5 are rare.
I’m not sure what you mean by that. A prior shouldn’t count as a data point, and if you’re counting a prior as a data point anyway, then even in your examples of 1 data point being insufficient to draw confident conclusions from, we actually have more than 1 data point, since I have a prior about those, too.
What do you suggest instead? Given only the information that you are a human, do you give more than 50% credence that you are one of the first 50% of humans to be born? Wouldn’t that seem rather absurd?
I don’t see what God or reincarnation could possibly have to do with this topic.
While I disagreed with your other comment, I’m inclined to agree with this one. “The amount of people to ever live” feels like a rather arbitrary and ill-defined number, and the argument’s Wikipedia article has a number of objections which I agree with and which follow similar lines: we don’t know that the number of people to ever live is necessarily finite, human population seems more likely to be exponentially than uniformly distributed, and the whole argument gets rather paradoxical if you apply it to itself.
It feels like the kind of thing that’s a solid argument if you accept its premises, but then there’s no particular reason to expect that the premises would be correct.
I don’t understand. What does it mean to apply the doomsday argument to itself?
Hm, interesting. It is a bit unfair to talk about what the first person to formulate the doomsday argument could have used it to predict, for the same reason it is unfair to talk about what the first people to exist could have used the doomsday argument to predict: being the person who comes up with an idea that becomes popular is rare, and we expect probabilistic arguments to fail in rare edge cases. Also, the self-referencing doomsday argument rebuttal offers a counter-rebuttal of itself: It is more recent and has been considered by fewer people than the doomsday argument has, so it will probably have a shorter lifespan. (yes, I thought of the obvious counter-counter-rebuttal.)
The main point that I get out of those examples is that the Doomsday Argument is really a fully general argument that can be applied to pretty much anything. You can apply it to itself, or I could apply it to predict how many days of life I still have left, or for how long I will continue to remain employed (either at my current job, or in general), or to how many LW comments I am yet to write...
A claim like “my daughter just had her first day of school, and if we assume that she’s equally likely to find herself at any position n of her total number N of lifetime school days, then it follows that there’s a 95% chance that she will spend a maximum of 20 days of her life going to school” would come off as obviously absurd, but I’m not sure why the Doomsday Argument would be essentially any different.
It’s possible to argue that it is appropriate to use SIA in some of those examples, but SSA for the duration of the human race.
The doomsday argument doesn’t say that, even if you do use SSA with the reference class of days that your daughter is in school. You’re confusing the likelihood of the evidence given the hypothesis with the posterior probability of the hypothesis given the evidence.
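A sketch of that distinction, with an invented prior standing in for what a parent actually knows about how long school lasts:

```python
# Likelihood vs posterior in the school example. The prior below (roughly
# uniform over 500..3000 total school days) is an illustrative assumption,
# standing in for the parent's background knowledge.

N_values = range(1, 3001)

def prior(N):
    # Background knowledge: schooling lasts years, not weeks.
    return 1.0 if 500 <= N <= 3000 else 0.0

def likelihood(n, N):
    # "Equally likely to be on any day n of N total school days."
    return 1.0 / N if 1 <= n <= N else 0.0

n_observed = 1  # first day of school

post = {N: prior(N) * likelihood(n_observed, N) for N in N_values}
z = sum(post.values())
post = {N: p / z for N, p in post.items()}

p_le_20 = sum(p for N, p in post.items() if N <= 20)
print("Posterior P(N <= 20 days of school):", p_le_20)   # 0.0: the prior rules it out

# The likelihood statement "P(n <= N/20 | N) is about 0.05" holds for any
# large N, but it only turns into "95% chance that N <= 20" if your prior
# over N is so flat that it carries essentially no information, which the
# prior above does not.
```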
That summarized it better than I could. Thank you!
I think you’re wrong. If you have no idea of what the sample space looks like, that chance isn’t “pretty small”, it’s unknown.
The chance of being in the first 10^-50 of the sample space is, in fact, precisely 10^-50, which is pretty small.
You are assuming a uniform distribution across the sample space. That assumption seems to be hanging in mid-air: there is absolutely nothing supporting it.
Not assuming. Defining. The measure on the sample space that I was referring to when I said “first 10^-50” is the probability distribution on the sample space. No other measure of the sample space has even been mentioned. I really was not saying anything that’s not totally tautological.
Edit: Or are you referring back to the Doomsday argument and suggesting that you are not, a priori, a randomly selected human?
That’s not what people usually mean when they say things like “the first 1% of the sample space”. When they want to talk about the probability distribution on the sample space they tend to say “the first 1% of the population”.
How do you know what the probability distribution in the sample space is, anyway?
If no other measure on the sample space has been mentioned, they aren’t likely to be referring to anything else. Anyway, it’s what I was referring to.
You don’t, but whatever it is, you have a 10^-50 chance of being in the first 10^-50 of it.
Then I don’t understand what the point of your original comment was.
I said: “If you have no idea of what the sample space looks like, that chance isn’t ‘pretty small’, it’s unknown.”
Your example where you happen to know not just the sample space but actually the probability distribution seems entirely irrelevant.
I was not trying to imply that the probability distribution on the sample space was known, simply that whatever it is, the probability of being in the first 10^-50 of it is 10^-50. The entire point of the doomsday argument (and much of Bayesian statistics, for that matter) is that if the probability distribution on the sample space is not known, you can use your data point to update your prior over the possible distributions.
Under SSA we expect to be in the universe with fewer observers, which would mean that the other universe (‘2014’) is likely to have more observers (since, according to SSA, we are in the smaller universe and we are not in ‘2014’). I am not sure why you are saying that SSA would produce the Doomsday Argument or that we’d expect ‘2014’ to be smaller.
Under SIA we expect that we are in the box with more observers, which would mean that ‘2014’ has fewer observers, and that’s about it.
I got confused thinking the different boxes are different universes (which made me apply a different reasoning), but I now realize that you mean that all people from both boxes are in your reference class in this scenario (which can explain the SSA reasoning). In this case...
I don’t get what you mean here.
Nor here.
SIA doesn’t really make other people in your reference class who are somewhere else (in some box) less likely unless I am missing something.
...based on the number of people we observe in 2013.
That still doesn’t mean anything to me. Do you mean lower than the number of people we observe in 2013? It sounds like you mean lower than what we will project/estimate based on the number we see in 2013, but using what method?
I am sorry, but this adds close to zero clarity about what you mean. As I said, SIA does not decrease the number of people who (might) exist in your reference class under normal conditions, and the only difference between normal conditions here is that some of the people in your reference class are put in some box with some label. And putting the unknown quantity of people in your reference class who might be you in a box doesn’t change anything—they can be in a box, on another planet, in your basement or wherever. (Again I should add ‘unless I am missing something’.)
Without SIA, we could assume, a priori, that there would be roughly equal numbers of people in 2013 and 2014. SIA increases the expected number of people in 2013. So we now expect there to be more people in 2013 than 2014, or a diminishing of population from 2013 to 2014: a doomsday effect. This is especially pronounced if we assume there is a fixed number of observers, so moving more to 2013 actively reduces those in 2014.
Causality removes this effect, since an increase in 2013 also causes an expected increase in 2014. But in cases without causality, it is present. If we are short-term simulations in 2013, we should expect that there are fewer short-term simulations in 2014, and hence that we are slightly less likely to have simulations that continue our lives into the next year.
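A toy version of the fixed-number-of-observers case (the total of 1000 and the uniform prior over the split are invented for illustration):

```python
# Fixed-total sketch: T observers split between the 2013 and 2014 boxes,
# with a uniform prior over the split. T = 1000 is an arbitrary assumption.

T = 1000
splits = range(1, T)                    # n13 observers in 2013, T - n13 in 2014
prior = {n13: 1.0 / len(splits) for n13 in splits}

# SIA: weight each split by the number of observers who see "2013".
sia = {n13: prior[n13] * n13 for n13 in splits}
z = sum(sia.values())
sia = {n13: w / z for n13, w in sia.items()}

e14_prior = sum((T - n13) * p for n13, p in prior.items())
e14_sia = sum((T - n13) * p for n13, p in sia.items())

print("E[2014 observers], prior:", round(e14_prior, 1))   # ~500
print("E[2014 observers], SIA:  ", round(e14_sia, 1))     # ~333: a doomsday shift
```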
Under UDASSA there is no problem, right? Adding more observers to a given year cancels exactly, because it takes more information to specify an individual observer the more of them there are. Meanwhile earlier years are marginally more likely because they’re easier to specify.
I guess there is still a doomsday though—we should expect the log of the year of humanity’s existence in which we find ourselves to be close to its average. But that’s a much more palatable answer, at least to me.
Under UDASSA, if there are N observers (or observer moments), then it takes an average of log N + C bits to localise any one of them. C here is an overhead representing the length of program needed to identify the observers or observer moments. (Think of the program as creating a list of N candidates, and then a further log N bits are needed to pick one item from the list.)
So this favours hypotheses making N small—which gives a Doomsday argument.
The extra complication is that there may be less overhead when using different programs to identify different sorts of observer moment (corresponding to different reference classes). Say there are M observers in “our” reference class, and “our” reference class is identified by a program of length C-O.
Then provided C-O + log M < C + log N, UDASSA will now favour hypotheses which make M as small as possible: it says nothing about the size of N.
You still get a doomsday argument in that “observers like us” are probably doomed, but it is less exciting, because of the possibility of “observers like us” evolving into an entirely different reference class.
But the additional observers exactly cancel the extra data needed to identify a specific one, no? The length of the program that identifies the class of people alive in 2013 is the same however many people are alive in 2013. So the size of N is irrelevant and we expect to find ourselves in classes for which C is small.
That would be true in an SIA approach (probability of a hypothesis is scaled upwards in proportion to number of observers). It’s not true in an SSA approach (there is no upscaling to counter the additional complexity penalty of locating an observer out of N observers). This is why SSA tends to favour small N (or small M for a specific reference class).
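A rough sketch of that bookkeeping, with invented values for the overhead C and the two candidate observer counts:

```python
# Description-length bookkeeping from the comments above. C and the
# candidate values of N are made-up numbers for illustration.

import math

C = 100           # bits to write a program that lists the observers (assumed)
hypotheses = {"small world": 10**9, "large world": 10**12}   # N = observer count

for name, N in hypotheses.items():
    bits_per_observer = C + math.log2(N)          # to pin down one specific observer
    weight_one_observer = 2.0 ** -bits_per_observer

    # SSA-flavoured reading: my weight is just the weight of my observer-moment.
    ssa_weight = weight_one_observer               # = 2^-C / N: penalises large N

    # SIA-flavoured reading: sum the weight over all N observers who could be "me".
    sia_weight = N * weight_one_observer           # = 2^-C: the N cancels

    print(f"{name:12s}  per-observer 2^-(C+log N) = {ssa_weight:.3e}   "
          f"summed over N = {sia_weight:.3e}")

# Without summing over observers (the SSA-style reading), the large world is
# penalised by a factor of N; with the summing (the SIA-style reading), the
# penalty cancels exactly, which is the cancellation described above.
```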
If I’m reading ASSA correctly, then it still has a doomsday argument (and UD doesn’t change this).
I am surprised that some smart people have been discussing the doomsday argument seriously for decades. The whole premise, “supposing the humans alive today are at a random place in the whole human history timeline”, assumes that we can apply the idea of probabilities and probability distributions to a one-off event. This particular piece of math works when applied to statistical inference, “the process of drawing conclusions from data that are subject to random variation”, not to a single data point with no way to repeat the experiment. There is no reason to think, or to check, that this particular map corresponds to this particular territory. To quote Maslow, “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail”, and the doomsday argument is a good example of trying to hit a not-nail with a hammer.
I find doomsday arguments to be inherently flawed, and only taken seriously because they’re relatively novel. Seriously, imagine a doomsday cult from 50,000 years ago using statistical arguments to argue that the human race had at most 1,000 years left, because smaller populations are “more likely” than larger ones, and otherwise they’d have been assigned to the population of billions rather than thousands.
(Of course, we -could- be delusional. But then, if you’re making this point, effectively you’re trying to argue that empiricism is faulty. It might be, of course, but such an argument, and any dependent upon it, is without value, as it has no predictive power.)
I’m not entirely persuaded by doomsday arguments (yet) either, but it does seem that there’s an easy answer to your point about the ancient cult. Suppose doom is coming soon for us, but people throughout history have been believing falsely that the doomsday argument applied to them. In that case, the vast majority of people who applied doomsday reasoning throughout time will still turn out to have been correct, because now there are so many of us!
(P.S. I don’t think I understood the point about us being delusional—where did that come from?)
The delusional part was a reference to a comment in the OP about deluded worlds, where empiricism isn’t valid regardless of all else. Imagine, for example, a universe in which you’re the only person, living an elaborate delusion that you’re actually interacting with other minds. The entire argument collapses into meaninglessness.
And a quick Google search suggests currently living humans only constitute around 10% of all human beings which have ever lived, so there’s a 90% chance that any given believer is incorrect. At what point it becomes correct depends on population dynamics which I hesitate to forecast into the future (except that it’s going to remain valid for at least the next century, provided the birth rate doesn’t rise considerably).
But none of this gives you any additional information about whether or not doomsday is in fact correct—even if it were accurate about the population, and it told you that most people who believe in doomsday are correct, it doesn’t give you any information about whether or not you personally are correct. There’s no new information contained there to update your existing knowledge upon.
You didn’t prove anything. How is our situation now (compared with a galaxy-wide human civilization) different than the caveman (vs. today)? How do we know the ‘vast majority’ of humans are not still in the future, yet to be born?
The difference is that we exist, whereas a galaxy-wide human civilization may or may not exist in the future.
We don’t know that, but this is what the doomsday argument provides evidence for.
In general, statistical arguments only work in most cases, not in all cases, so the fact that the first 1% of people would have been misled by the doomsday argument does not show that the doomsday argument is flawed.
That’s exactly my point—the caveman could make the same argument: “we hunter-gather tribes exist, whereas a planet-wide civilization of 7 billion human beings may or may not exist in the future.”
Doomsday will come, whether it is around the corner or 120 billion years from now in the heat-death of the universe. At some point some group of humans will be correct in believing the end is nigh. But that is not to say that, by the OP’s anthropic reasoning, they would be justified in believing so. Rather, a broken clock is still right twice a day.
And your point has been countered with “although the caveman would have been wrong, most people in history who made the argument would have been right, for which reason the caveman would have been justified in making that argument”, which you haven’t addressed.
If you have to choose between guessing “this fair six-sided die will produce a six on its next roll” or “this fair six-sided die will produce a non-six on its next roll”, then the latter alternative is the right one to pick since it maximizes your chance of being correct. Yes, you know for certain that this guess will turn out to be wrong in one sixth of the cases, but that doesn’t mean that the math of “choosing non-six maximizes your chance of being right” would be wrong.
Similarly, we know that of everyone who could make the Doomsday Argument, some percentage (depending on the specifics of the argument) will always be wrong, but that doesn’t mean that the math of the Doomsday Argument would be wrong.
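The same point as a tiny calibration check (the total population figure is an arbitrary stand-in):

```python
# If every human who ever lives reasons "with 95% confidence, at least 1/20
# of all humans are born before me" (the doomsday-style inference), what
# fraction of them end up right? N_total is an invented number.

N_total = 10**6   # assumed total number of humans who will ever live

# A person with birth rank r is wrong exactly when r falls in the first 5%.
num_wrong = sum(1 for r in range(1, N_total + 1) if r <= 0.05 * N_total)
print("fraction of reasoners who are wrong:", num_wrong / N_total)   # 0.05
```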
Hmm… good point!
We know stuff the caveman didn’t, so P(exactly n people will ever live | what we know) != P(exactly n people will ever live | what the caveman knew).