Not Even Evidence
Introduction.
I’ve been thinking about Anthropic Arguments recently, and I’ve noticed a disturbing lack of hard-hitting thought experiments on the topic- ones that can be shown to be true or false given our knowledge of physics and the like.
The self sampling assumption is as follows:

All other things equal, an observer should reason as if they are randomly selected from the set of all actually existent observers (past, present and future) in their reference class.
In this post, I’m using the reference class of “Algorithms identical to you, but perhaps in different times and places”- that seems the least objectionable. This includes exact simulated copies of you and exact physical duplicates of you.
The self sampling assumption claims that our experiences are probabilistic evidence about the observations of people in the reference class we’re in. If H1 predicts that observers like us exist and make up a rather large proportion of all observers, and H2 predicts that observers like us exist but make up a very small proportion of all observers, then ceteris paribus we should favor H1 over H2. Our experiences provide meaningful evidence in this pretty unobjectionable and intuitive way.
There are real world consequences of accepting this sort of reasoning, some of which punish us with physics-verified irrationality: we will act in ways that we can verify, both in advance and in retrospect, cannot be based on any communicated information, and more contrived examples can verifiably make us update incorrectly on specific information more often than we update correctly, even if we know all the relevant information.
This lets us glimpse the underlying problems with our intuitions about some sorts of anthropic reasoning.
Consider the following hypothetical:
There are a million exact copies of you (including you, of course) throughout the universe, all of which are non-local to one another- their futures do not intersect.
Now, consider the following two hypotheses, to each of which we’ve assigned probability exactly .5 for simplicity:
Scenario 1: All of your copies (including you) are exactly identical and will always be exactly identical. Their entire observable universe has been and always will be precisely the same, and they will not see a bright pink pop-up on their screen roughly a second after they read the sentence below these two hypotheses.
Scenario 2: All of your copies (including you) are exactly identical until roughly a second after they read the sentence below these two hypotheses, at which point all but one of the copies get a bright pink pop-up on their screen saying “You were one of the 999,999!”
If you don’t see a bright pink pop-up on your screen, should you update in favor of Scenario 1 and against Scenario 2? In a way, you’re more likely to have not seen the pop-up if you’re in Scenario 1.
It initially seems fair to me to say:
P(Scenario 1 | No pop-up) = P(No pop-up | Scenario 1) * P(Scenario 1) / P(No pop-up)
= (1 * .5) / (.5 * 1 + .5 * 1/1,000,000)
= 1,000,000/1,000,001

P(Scenario 2 | No pop-up) = P(No pop-up | Scenario 2) * P(Scenario 2) / P(No pop-up)
= (1/1,000,000 * .5) / (.5 * 1 + .5 * 1/1,000,000)
= 1/1,000,001
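As a quick sanity check of that arithmetic, here’s the same Bayes update done with exact fractions (just the numbers assumed above; nothing new):

```python
from fractions import Fraction

# Priors assumed above: each scenario gets probability 1/2.
prior = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}

# Chance a given copy sees no pop-up: all copies in Scenario 1,
# only 1 of the 1,000,000 copies in Scenario 2.
no_popup = {"s1": Fraction(1), "s2": Fraction(1, 1_000_000)}

# Total probability of the "no pop-up" observation.
p_no_popup = sum(prior[s] * no_popup[s] for s in prior)

# Bayes' rule for each scenario.
for s in prior:
    print(s, prior[s] * no_popup[s] / p_no_popup)
# s1 1000000/1000001
# s2 1/1000001
```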
So, you’re in Scenario 1, right? Naively, this seems (at least to me) very fair.
However:
You can presumably do something like this in real life! Make a bunch of probes with the information necessary to run exactly identical simulations of you in an exactly identical environment which will evolve exactly the same into the indefinite future, and have them fall out of your cosmological horizon.
If any of the probes have seen <event x> by <time x> (You can basically ignore the time part of this, I’m including time to make the argument more intuitive), have them begin to tile their light cones with computational substrate prepared to run copies of you which will see a pink pop-up appear five seconds into the simulation.
At <time x + y>, where y is enough time for the probes that saw <event x> to build enough computers that you’d agree the absence of a pink pop-up provides strong, justified evidence that none of the probes saw <event x>, begin all of the simulations- including the ones on the probes that didn’t see <event x>.
If you don’t see a pop-up, and you think this somehow allows you to justifiably update in favor of no probes having seen <event x>, you’re claiming that your brain state corresponds to phenomena entirely non-local to you.
The no-pop-up copies of you can’t update on which outcome happened based on the fact that they’re no-pop-up copies (except in the trivial sense that they can infer, with ~1 probability, that at least one probe didn’t see <event x> by <time x>, because the probes that see <event x> by <time x> show the pop-up to all of their copies with almost zero error). The pop-up copies, on the other hand, can update to ~1 probability that at least one of the probes saw <event x> by <time x>.
If you don’t see the pop-up, you will update in exactly the same way regardless of which outcome actually happened. That’s the crux- there’s no change to how you update based on the thing you’re trying to update on, so you’re not doing meaningful cognitive work.
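To make the gap concrete, here’s a minimal sketch of the two calculations a no-pop-up copy could run. All of the numbers are invented assumptions for illustration (K probes, an independent per-probe chance p of seeing <event x>, and M pop-up copies per probe that saw it); the post doesn’t commit to any of them.

```python
from math import comb

# Invented parameters: K probes, each independently sees <event x> with
# probability p; a probe that saw it runs M pop-up copies of you, a probe
# that didn't runs a single no-pop-up copy.
K, p, M = 10, 0.3, 1_000_000

def pmf(n):
    """Probability that exactly n of the K probes saw <event x>."""
    return comb(K, n) * p**n * (1 - p)**(K - n)

def p_no_popup_given(n):
    """SSA-style chance a 'randomly selected' copy sees no pop-up, given n seeing probes."""
    return (K - n) / ((K - n) + n * M)

# Unconditional probability that at least one probe saw <event x>.
p_any = 1 - (1 - p)**K

# SSA-style posterior: treat yourself as a random copy, condition on "no pop-up".
p_no_popup = sum(pmf(n) * p_no_popup_given(n) for n in range(K + 1))
ssa_posterior = sum(pmf(n) * p_no_popup_given(n) for n in range(1, K + 1)) / p_no_popup

# "Local" view: a no-pop-up copy learns only that its own probe saw nothing;
# under the independence assumption, the other K - 1 probes are unaffected by that.
p_any_other = 1 - (1 - p)**(K - 1)

print(f"P(some probe saw <event x>)                    = {p_any:.6f}")
print(f"SSA posterior, given that you see no pop-up    = {ssa_posterior:.6f}")
print(f"P(some other probe saw it | your probe didn't) = {p_any_other:.6f}")
```

The SSA-style number collapses toward zero while the local number barely budges, and the gap between them is exactly the update that, per the argument above, can’t be backed by anything that actually reaches you.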
The other members of your reference class don’t need to be outside of your cosmological horizon, either. If your method makes you update your probabilities in some way which does not respect locality (for example, by letting you update faster than light could carry the news), it can’t work.
In fact, very similar schemes would let you “communicate” across time, too- and across universes, branches of the universal wavefunction, heavenly spheres, etc. Your model isn’t respecting position in general. This is a reductio ad absurdum of the entire idea (namely, the self sampling assumption and the broader family of intuitions like it).
The fact that you’re experiencing or not experiencing something cannot be treated as evidence for how likely it was for copies of you to experience or not experience that thing. You can’t reason as if you were randomly selected from the pool of all actually existent observers in your reference class- if you do, you can draw conclusions about what your reference class looks like in weird, physics-breaking ways.
If we look closely at an actual example of this effect, we can tell that the self sampling assumption doesn’t allow you to gain any information you don’t already have. You could always change the reference class, but to what? I think you need to change it to “exactly you” in order to avoid making unjustified claims about information you don’t have access to, which defeats the entire purpose.
There are even sketchier versions of this “type” of reasoning such as the Doomsday Argument. The Doomsday Argument can’t allow you to justifiably update for very similar reasons.
What if...
“What if, before you halt your simulation and copy yourself into the probes, you conclude that <event x> is going to be observed by at least one probe with probability 99%? What probability estimate should you use for seeing a pink pop-up? Shouldn’t it be really high?”
Yes, but I would already believe that. Anthropic reasoning isn’t giving me any evidence.
“But if you don’t see a pink pop-up, are you going to change your 99% estimate that at least one probe saw the event?”
I can’t. If I claim that I’m making a justified update using this principle of reasoning, then I’m claiming I’m breaking the laws of physics into tiny bits using nothing but this super sketchy principle of reasoning. I’m going to keep my probabilities exactly the same- it turns out that this literally isn’t evidence, at least to the no pop-up copies of me.
“That seems silly.”
It does, but it seems less silly than breaking locality. I’m not going to update in ways that I don’t think correspond with the evidence, and in this situation, while it seems that my update would correspond with the evidence, it can’t.
“Well, why is it silly? You’ve shown that this plausible-sounding type of Anthropic Reasoning is wrong- but where did it go wrong? What if it works for other scenarios?”
Well, how did we decide it was plausible in the first place? I have a sneaking suspicion that the root of this confusion is a questionable view of personal identity and questionable handling of reference classes. I’ll write something about these in a later post… hopefully.
Pretty Normal?
Disclaimer: I’m going to say P = 1 and P = 0 for some things. This isn’t actually true, of course- yada yada, the piping of flutes, reality could be a lie. At least, I wouldn’t know if it were true. I don’t think constantly keeping in mind the fact that we could all be in some Eldritch God’s dream and the other sorts of obscure nonsense required for the discussed probabilities to be wrong is useful here, so go away.
Also, yes, the universe is seemingly deterministic and that’s in conflict with some of the wording here, but the idea still applies. Something something self locating uncertainty in many worlds.
Consider the following example: we survived the cold war, and we want to know how dangerous it actually was.
P(dangerous | survival) = P(dangerous) * (P(survival | dangerous)/P(survival)).
If you survive, you’re going to downgrade your probability that the cold war was dangerous. Seems clear enough...
But wouldn’t we always have updated in the direction of the cold war being less dangerous? Whenever we do this update, we’re always going to update in that direction- we need to have survived to perform the update at all. I’m never going to update in the direction of something having been dangerous in this way. Updates are only being done in one direction, regardless of the true layout of reality...
We already know how we’re going to update.
Because of the weird selection effects behind our updates, we already know that whenever an update is being done, by these rules, we’re updating towards the world having been safe.
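Here’s a minimal simulation of that selection effect. The numbers are made up for illustration (a 0.5 prior on “dangerous” and arbitrary survival probabilities); the point is just that everyone who is around to perform the update computes the same posterior, in the same direction, whichever kind of world they’re actually in.

```python
import random

# Invented numbers: prior on the world being dangerous, and assumed
# survival probabilities in each kind of world.
P_DANGEROUS = 0.5
P_SURVIVE = {"dangerous": 0.1, "safe": 0.99}

def survivor_posterior():
    """The update every survivor performs, via the Bayes formula above.
    Note that it doesn't depend on which world is actually real."""
    p_survive = (P_DANGEROUS * P_SURVIVE["dangerous"]
                 + (1 - P_DANGEROUS) * P_SURVIVE["safe"])
    return P_DANGEROUS * P_SURVIVE["dangerous"] / p_survive

def run_trials(true_world, n=100_000):
    """Count how many observers survive to update, and which way they move."""
    survivors = toward_safe = 0
    for _ in range(n):
        if random.random() < P_SURVIVE[true_world]:
            survivors += 1
            if survivor_posterior() < P_DANGEROUS:  # always true here
                toward_safe += 1
    return survivors, toward_safe

for world in ("safe", "dangerous"):
    survivors, toward_safe = run_trials(world)
    print(f"true world = {world:9s}: {survivors} updates performed, "
          f"{toward_safe} of them toward 'safe', posterior = {survivor_posterior():.3f}")
```

In both runs, every update that gets performed moves toward “safe”, and they all land on the same number- which is the sense in which you already know how you’re going to update.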
“But, my errors are still decreasing on average, right?”
You will have the lowest retrospective rates of inaccuracy if you conclude that P(Survival) = 1, because you’re never going to gather evidence contrary to that notion. You’re not gaining evidence about the actual retrospective probability, though- the cold war could have had P(Apocalypse) ≈ 1 and you’d reach the exact same conclusion based on this particular evidence.
Every sentient being throughout all of space and time is always deciding that things were safer, using this principle. This particular evidence is already perfectly correlated with you existing- you’re not learning anything.
Imagine that the world is either safe (low risk of existential catastrophe) or dangerous (high risk of existential catastrophe). Then A1 would argue that P(safe | we survived) is the same as P(safe): our survival provides no evidence of the world being safe.

A1 does not claim that P(safe | we survived) = P(safe). A1 is talking about how we should reason given that we have observed that we have survived.
If you still think this is splitting hairs, consider the difference between
P(We will observe that we did not survive) and P(We will not survive).
P(We will observe that we did not survive) = 0, of course.
A1 claims that P(We survived | We're talking about this) = 1.
As far as our updates are concerned, P(We observe that we survived) = 1.
As a consequence, ∀x ∈ X, P(x | we observe that we survived) = P(x), where X is the set of all possibilities- at least from our perspective. The fact that we observe that we survived provides us no additional evidence for anything, from our perspective.
You hopefully see the problem.
It’s not evidence that us surviving was likely or unlikely. It’s not even evidence, at least to us.
A2: “If we had won the lottery, a lottery-losing us wouldn’t be around to observe it. Hence we can’t conclude anything from our loss about the odds of winning the lottery.”

We do not always update in favor of the idea that we won the lottery- whether we won isn’t entangled with whether we’re around to perform the update, the way being alive is.
I can’t predict, in advance of the lottery results, how I’m going to update about how likely it was for me to win. Whenever I win or lose the lottery, I’m learning something I couldn’t have already incorporated into my priors (well, I could have- but think about a quantum lottery, perhaps)- something I didn’t know before. Evidence.
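For contrast, here’s the lottery case with invented numbers (a “fair vs. secretly rigged in your favor” lottery, which is not from the post). The direction of your update genuinely depends on an outcome you couldn’t have predicted, because you’d be around to observe either one.

```python
import random

# Invented setup: the lottery is either fair or secretly rigged in your favor.
P_WIN = {"fair": 1e-6, "rigged": 0.5}
PRIOR_RIGGED = 0.5

def posterior_rigged(won):
    """Ordinary Bayes update on the observed outcome."""
    like_rigged = P_WIN["rigged"] if won else 1 - P_WIN["rigged"]
    like_fair = P_WIN["fair"] if won else 1 - P_WIN["fair"]
    joint_rigged = PRIOR_RIGGED * like_rigged
    joint_fair = (1 - PRIOR_RIGGED) * like_fair
    return joint_rigged / (joint_rigged + joint_fair)

for true_world in ("fair", "rigged"):
    won = random.random() < P_WIN[true_world]
    post = posterior_rigged(won)
    direction = "up" if post > PRIOR_RIGGED else "down"
    print(f"true lottery = {true_world:6s}: won={won}, P(rigged) moves {direction} to {post:.6f}")
```

Both directions are possible, and which one you get is settled by the draw itself- that’s what makes the result evidence, unlike the survival case.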
A Broader Problem.
There’s a broader problem with being surprised about the things you observe, but I’m not sure where to draw the line. Supposedly, everyone reading this post should be unusually surprised about where they find themselves- they’re in a very unusual reference class.
But… you are in a highly unusual reference class. You’re not some magical essence plopped down in a randomized vessel- you’re literally the algorithm that you are. You cannot be surprised by being yourself- it is highly unsurprising. What’s the alternative? You couldn’t have been someone else- you couldn’t have been born after we develop AGI, or be an AGI yourself, or be an alien, or born in ancient Greece. Otherwise, you would be them, and not you. It doesn’t make sense to talk about being something other than you- there is no probability going on.
There’s a silly reverse side to this coin. “Well, you shouldn’t be surprised you’re finding yourself in this situation, no matter how improbable.” isn’t an explanation either. You can’t use the fact that you shouldn’t be surprised you’re finding yourself in this situation to play defense for your argument.
If I claim that the LHC has a 50% chance of instantly destroying all of reality every time it causes a collision, you can’t use “But that’s so unlikely! That can’t be the case, otherwise we most certainly wouldn’t be here!” as an argument. It sounds like you should be able to- that gets the obviously right answer! Unfortunately, “The LHC is constantly dodging destroying all of reality” really is “indistinguishable” from what we observe. Well, that’s not quite true- the two just can’t be distinguished using that particular argument.
However, I also can’t back my claim up by pointing that out. The fact that my claim is not technically incompatible with our observations is not substantial evidence.
In Conclusion:
Good Wholesome Cosmologist Anthropics is the idea that we shouldn’t adopt a model which predicts that we don’t exist. It’s a specific case of the “We shouldn’t adopt a model which predicts something which we know is not true” principle, which is itself a specific case of the “We should try to adopt accurate models” principle.
The arguments themselves might seem a little mysterious, but like Stuart Armstrong said, it’s normal.
Also, be aware of its sister: “We shouldn’t adopt models which predict that they are themselves wrong” a la Boltzmann Brains.
However, there are two more… interesting… veins of Anthropic Arguments to go mining:
Not Even Evidence- The Self Sampling Assumption, Self Indication Assumption, and so on suffer from this.
Recognizing that some stuff which seems like evidence is Not Even Evidence.