In that case, after every game, 1 in 6 of you die in the A scenario, and 0 in the B scenario, but in either scenario there are still plenty of “you”s left, and so SSA would say you shouldn’t increase your credence in B (provided you remove your corpses from your reference class, which is perfectly fine a la Bostrom).
Can you spell that out more formally? It seems to me that so long as I’m removing the corpses from my reference class, 100% of people in my reference class remember surviving every time so far just like I do, so SSA just does normal Bayesian updating.
The reference class you are proposing for this problem, just you, is even narrower than the trivial reference class (which includes everybody in your exact same epistemic situation, so that you couldn’t tell which one you are). It’s arguably not the correct reference class, given that even the trivial reference class is often too narrow.
I did mean to use the trivial reference class for the SSA assessment, just not in a large world. And it still seems strange to me that the size of the world would change the conclusion here. So even if you get this to work, I don’t think it reproduces my intuition. Besides, if the only reason we successfully learn from others is that we defined our reference class to include them, then the assumption we can’t update against is just “what reference class we’re in”. I’d similarly count this as a non-solution that’s just hard-wiring the right answer.
Can you spell that out more formally? It seems to me that so long as I’m removing the corpses from my reference class, 100% of people in my reference class remember surviving every time so far just like I do, so SSA just does normal Bayesian updating.
Sure, as discussed for example here: https://www.lesswrong.com/tag/self-sampling-assumption, if there are two theories, A and B, that predict different (non-zero) numbers of observers in your reference class, then on SSA that doesn’t matter. Instead, what matters is what fraction of observers in your reference class have the observations/evidence you do. In most of the discussion from the above link, those fractions are 100% on either A or B, resulting, according to SSA, in your posterior credences being the same as your priors.
This is precisely the situation we are in for the case at hand, namely when we make the assumptions that:
The reference class consists of all survivors like you (no corpses allowed!)
The world is big (so there are non-zero survivors on both A and B).
So the posteriors are again equal to the priors and you should not believe B (since your prior for it is low).
I did mean to use the trivial reference class for the SSA assessment, just not in a large world. And it still seems strange to me that the size of the world would change the conclusion here.
I completely agree, it seems very strange to me too, but that’s what SSA tells us. For me, this is just one illustration of serious problems with SSA, and an argument for SIA.
If your intuition says to not believe B even if you know the world is small then SSA doesn’t reproduce it either. But note that if you don’t know how big the world is you can, using SSA, conclude that you now disbelieve the combination small world + A, while keeping the odds of the other three possibilities the same—relative to one another—as the prior odds. So basically you could now say: I still don’t believe B but I now believe the world is big.
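To make the four-way bookkeeping concrete, here is a rough Python sketch. The priors (world size a coin flip, B at 1 in 1000) and the 100 rounds are numbers I made up for illustration. The treatment of small world + A is my own modelling assumption: with a single copy of you the survivor class is empty unless that copy survived, so its weight gets multiplied by the chance of surviving every game.

```python
from fractions import Fraction

def roulette_posterior(prior, rounds):
    """SSA posterior over (world size, theory) after surviving `rounds` games.

    Reference class: survivors with your memories (no corpses).
    """
    # Chance that a single copy of you survives every game if the gun is ordinary (A).
    survive_a = Fraction(5, 6) ** rounds
    weights = {}
    for (size, theory), p in prior.items():
        if (size, theory) == ("small", "A"):
            # Assumed modelling choice: with one copy of you, the class is empty
            # (Q = 0) unless that copy survived, so the average Q is survive_a.
            q = survive_a
        else:
            # Survivors exist, and all of them remember surviving: Q = 1.
            q = Fraction(1)
        weights[(size, theory)] = p * q
    total = sum(weights.values())
    return {world: w / total for world, w in weights.items()}

# Made-up priors: world size is a coin flip, B ("magic") starts at 1 in 1000.
prior = {
    ("small", "A"): Fraction(999, 2000),
    ("small", "B"): Fraction(1, 2000),
    ("big", "A"): Fraction(999, 2000),
    ("big", "B"): Fraction(1, 2000),
}
post = roulette_posterior(prior, rounds=100)
p_big = post[("big", "A")] + post[("big", "B")]
p_b = post[("small", "B")] + post[("big", "B")]
print(float(p_big), float(p_b))
```

With these inputs the credence in a big world ends up close to 1 while the credence in B stays small, and the odds among the three surviving possibilities are the same as in the prior.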
Finally, as I mentioned, I don’t share your intuition: I believe B over A if these are the only options. If we are granting that my observations and memories are correct, and the only two possibilities are that I just keep getting incredibly lucky OR “magic”, then with every shot I’m becoming more and more convinced of magic.
In most of the discussion from the above link, those fractions are 100% on either A or B, resulting, according to SSA, in your posterior credences being the same as your priors.
For the anthropic update, yes, but isn’t there still a normal update, where you just update on the gun not firing as an event, rather than on your existence? Your link doesn’t have examples where that would be relevant either way. But if we didn’t do this normal updating, then it seems like you could only learn from an observation if some people in your reference class make the opposite observation in different worlds. So if you use the trivial reference class, you will give everything the same probability as your prior, except for eliminating worlds where no one has your epistemic state and renormalizing. You will expect to violate Bayes’ law even in normal situations that don’t involve any birth or death. I don’t think that’s how it’s meant to work.
You have described some bizarre issues with SSA, and I agree that they are bizarre, but that’s what defenders of SSA have to live with. The crucial question is:
For the anthropic update, yes, but isn’t there still a normal update?
The normal updates are factored into the SSA update. A formal reference would be the formula for P(H|E) on p.173 of Anthropic Bias, which is the crux of the whole book. I won’t reproduce it here because it needs a page of terminology and notation, but instead will give an equivalent procedure, which will hopefully be more transparently connected with the normal verbal statement of SSA, such as one given in https://www.lesswrong.com/tag/self-sampling-assumption:
SSA: All other things equal, an observer should reason as if they are randomly selected from the set of all actually existent observers (past, present and future) in their reference class.
That link also provides a relatively simple illustration of such an update, which we can use as an example:
Notice that unlike SIA, SSA is dependent on the choice of reference class. If the agents in the above example were in the same reference class as a trillion other observers, then the probability of being in the heads world, upon the agent being told they are in the sleeping beauty problem, is ≈ 1⁄3, similar to SIA.
In this case, the reference class is not trivial, it includes N + 1 or N + 2 observers (observer-moments, to be more precise; and N = trillion), of which only 1 or 2 learn that they are in the sleeping beauty problem. The effect of learning new information (that you are in the sleeping beauty problem or, in our case, that the gun didn’t fire for the umpteenth time) is part of the SSA calculation as follows:
Call the information our observer learns E (in the example above E = you are in the sleeping beauty problem)
You go through each possibility for what the world might be according to your prior. For each such possibility i (with prior probability Pi) you calculate the chance Qi of having your observations E assuming that you were randomly selected out of all observers in your reference class (set Qi = 0 if there are no such observers).
In our example we have two possibilities: i = A, B, with Pi = 0.5. On A, we have N + 1 observers in the reference class, with only 1 having the information E that they are in the sleeping beauty problem. Therefore, QA = 1 / (N + 1) and similarly QB = 2 / (N + 2).
We update the priors Pi based on these probabilities: the lower the chance Qi of you having E in some possibility i, the more strongly you penalize it. Specifically, you multiply Pi by Qi. At the end, you normalize all probabilities by the same factor to make sure they still add up to 1. To skip this last step, we can work with odds instead.
In our example the original odds of 1:1 then update to QA:QB, which is approximately 1:2, as the above quote says when it gives “≈ 1/3” for A.
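For anyone who wants to check the arithmetic, the steps above can be sketched in a few lines of Python. This is only my reading of the procedure as described here, not Bostrom's formula, and the function and argument names are made up for illustration.

```python
from fractions import Fraction

def ssa_update(priors, with_e, in_class):
    """Posterior over possible worlds after observing E, following the steps above.

    priors[i]   -- prior probability P_i of world i
    with_e[i]   -- observers in your reference class who have observation E in world i
    in_class[i] -- all observers in your reference class in world i
    """
    weights = {}
    for world, prior in priors.items():
        total = in_class[world]
        # Q_i: chance of having E if you are a random member of the class (0 if the class is empty).
        q = Fraction(with_e[world], total) if total > 0 else Fraction(0)
        weights[world] = prior * q
    norm = sum(weights.values())  # zero only if E is impossible in every world
    return {world: w / norm for world, w in weights.items()}

N = 10**12  # outsiders who share the reference class
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
posterior = ssa_update(
    priors,
    with_e={"A": 1, "B": 2},
    in_class={"A": N + 1, "B": N + 2},
)
print(float(posterior["A"]))
```

With N = 10**12 the posterior for A works out to (N + 2) / (3N + 4), which is a hair above 1/3. Setting N = 0 leaves both worlds at 1/2, since then every observer in the class has E.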
So if you use the trivial reference class, you will give everything the same probability as your prior, except for eliminating worlds where no one has your epistemic state and renormalizing. You will expect to violate Bayes’ law even in normal situations that don’t involve any birth or death. I don’t think that’s how it’s meant to work.
In normal situations using the trivial class is fine with the above procedure with the following proviso: assume the world is small or, alternatively, restrict the class further by only including observers on our Earth, say, or galaxy. In either case, if you ensure that at most one person, you, belongs to the class in every possibility i then the above procedure reproduces the results of applying normal Bayes.
If the world is big and has many copies of you then you can’t use the (regular) trivial reference class with SSA, you will get ridiculous results. A classic example of this is observers (versions of you) measuring the temperature of the cosmic microwave background, with most of them getting correct values but a small but non-zero number getting, due to random fluctuations, incorrect values. Knowing this, our measurement of, say, 2.7K wouldn’t change our credence in 2.7K vs some other value if we used SSA with the trivial class of copies of you who measured 2.7K. That’s because even if the true value was, say, 3.1K there would still be a non-zero number of you’s who measured 2.7K.
To fix this issue we would need to include in your reference class whoever has the same background knowledge as you, irrespective of whether they made the same observation E you made. So all you’s who measured 3.1K would then be in your reference class. Then the above procedure would have you severely penalize the possibility i that the true value is 3.1K, because Qi would then be tiny (most you’s in your reference class would be ones who measured 3.1K).
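Here is a toy Python version of the CMB example comparing the two choices of reference class. The numbers (10**9 copies of you, a one-in-a-million error rate, a 50/50 prior) are invented, and I am assuming for simplicity that there are only two candidate temperatures and that every erroneous measurement reads the other one.

```python
from fractions import Fraction

def posterior_2_7(prior_2_7, copies, error_rate, broad_class):
    """SSA credence that the true CMB temperature is 2.7K, given that you measured 2.7K."""
    priors = {"2.7K": prior_2_7, "3.1K": 1 - prior_2_7}
    # Copies of you whose apparatus reads 2.7K in each possible world.
    read_2_7 = {"2.7K": copies * (1 - error_rate), "3.1K": copies * error_rate}
    weights = {}
    for world, p in priors.items():
        # Broad class: every copy with your background knowledge.
        # Trivial class: only the copies who also read 2.7K.
        class_size = copies if broad_class else read_2_7[world]
        q = read_2_7[world] / class_size if class_size > 0 else Fraction(0)
        weights[world] = p * q
    return weights["2.7K"] / sum(weights.values())

half = Fraction(1, 2)
eps = Fraction(1, 10**6)
trivial = posterior_2_7(half, copies=10**9, error_rate=eps, broad_class=False)
broad = posterior_2_7(half, copies=10**9, error_rate=eps, broad_class=True)
print(trivial, float(broad))
```

With the trivial class the credence stays at the prior, because Q is 1 in both worlds. With the broader class it moves to 1 minus the error rate.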
But again, I don’t want to defend SSA, I think it’s quite a mess. Bostrom does an amazing job defending it but ultimately it’s really hard to make it look respectable given all the bizarre implications imo.
That link also provides a relatively simple illustration of such an update, which we can use as an example:
I didn’t consider that illustrative of my question because “I’m in the sleeping beauty problem” shouldn’t lead to a “normal” update anyway. That said I haven’t read Anthropic Bias, so if you say it really is supposed to be the anthropic update only then I guess. The definition in terms of “all else equal” wasn’t very informative for me here.
To fix this issue we would need to include in your reference class whoever has the same background knowledge as you
But background knowledge changes over time, and a change in reference class could again lead to absurdities like this. So it seems to me like the sensible version of this would be to have your reference class always be “agents born with the same prior as me”, or identical in an even stronger sense, which would lead to something like UDT.
Now that I think of it SSA can reproduce SIA, using the reference class of “all possible observers”, and considering existence a contingent property of those observers.
Learning that “I am in the sleeping beauty problem” (call that E) when there are N people who aren’t is admittedly not the best scenario to illustrate how a normal update is factored into the SSA update, because E sounds “anthropicy”. But ultimately there is not really much difference between this kind of E and the more normal sounding E* = “I measured the CMB temperature to be 2.7K”. In both cases we have:
1. Some initial information about the possibilities for what the world could be: (a) sleeping beauty experiment happening, N + 1 or N + 2 observers in total; (b) temperature of CMB is either 2.7K or 3.1K (I am pretending that physics ruled out other values already).
2. The observation: (a) I see a sign by my bed saying “Good morning, you are in the sleeping beauty room”; (b) I see a print-out from my CMB apparatus saying “Good evening, you are in the part of spacetime where the CMB photons hit the detector with energies corresponding to 2.7K”.
In either case you can view the observation as anthropic or normal. The SSA procedure doesn’t care how we classify it, and I am not sure there is a standard classification. I tried to think of a possible way to draw the distinction, and the best I could come up with is:
Definition (?). A non-anthropic update is one based on an observation E that has no (or a negligible) bearing on how many observers in your reference class there are.
I wonder if that’s the definition you had in mind when you were asking about a normal update, or something like it. In that case, the observations in 2a and 2b above would both be non-anthropic, provided N is big and we don’t think that the temperature being 2.7K or 3.1K would affect how many observers there would be. If, on the other hand, N = 0 like in the original sleeping beauty problem, then 2a is anthropic.
Finally, the observation that you survived the Russian roulette game would, on this definition, similarly be anthropic or not depending on who you put in the reference class. If it’s just you it’s anthropic, if N others are included (with N big) then it’s not.
The definition in terms of “all else equal” wasn’t very informative for me here.
Agreed, that phrase sounds vague, I think it can simply be omitted. All SSA is trying to say really is that P(E|i), where i runs over all possibilities for what the world could be, is not just 1 or 0 (as it would be in naive Bayes), but is determined by assuming that you, the agent observing E, are selected randomly from the set of all agents in your reference class (which exist in possibility i). So for example if half such agents observe E in a given possibility i, then SSA instructs you to set the probability of observing E to 50%. And in the special case of a 0⁄0 indeterminacy it says to set P(E|i) = 0 (bizarre, right?). Other than that, you are just supposed to do normal Bayes.
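Written out as a formula, the rule just described looks roughly like this. The symbols n_i (number of agents in your reference class in possibility i) and n_i(E) (those of them who observe E) are my own shorthand, and this is not meant to reproduce the formula from Anthropic Bias. The `cases` environment needs amsmath.

```latex
P(E \mid i) =
\begin{cases}
  \dfrac{n_i(E)}{n_i} & \text{if } n_i > 0, \\[1ex]
  0 & \text{if } n_i = 0,
\end{cases}
\qquad
P(i \mid E) = \frac{P_i \, P(E \mid i)}{\sum_j P_j \, P(E \mid j)}
```

The second expression is just ordinary Bayes with the SSA likelihood plugged in, where P_i is the prior for possibility i.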
What you said about leading to UDT sounds interesting but I wasn’t able to follow the connection you were making. And about using all possible observers as your reference class for SSA, that would be anathema to SSAers :)
Definition (?). A non-anthropic update is one based on an observation E that has no (or a negligible) bearing on how many observers in your reference class there are.
Not what I meant. I would say anthropic information tells you where in the world you are, and normal information tells you what the world is like. An anthropic update, then, reasons about where you would be, if the world were a certain way, to update on world-level probabilities from anthropic information. So sleeping beauty with N outsiders is a purely anthropic update by my count. Big worlds generally tend to make updates more anthropic.
What you said about leading to UDT sounds interesting but I wasn’t able to follow the connection you were making.
One way to interpret the SSA criterion is to have beliefs in such a way that in as many (weighted by your prior) worlds as possible, you would be as right as possible in the position of an average member of your reference class. If you “control” the beliefs of members in your reference class, then we could also say to believe in such a way as to make them as right as possible in as many worlds as possible. “Agents which are born with my prior” (and maybe “and using this epistemology”, or some stronger kind of identicalness) is a class whose beliefs are arguably controlled by you in the timeless sense. So if you use it, you will be doing a UDT-like optimization. (Of course, it will be a UDT that believes in SSA.)
And about using all possible observers as your reference class for SSA, that would be anathema to SSAers :)
Maybe, but if there is a general form that can produce many kinds of anthropics based on how its free parameter is set, then calling the result of one particular value of the parameter SIA and the results of all others SSA does not seem to cleave reality at the joints.