P(life is common|life on earth)=P(life is common), because knowing that life did evolve on earth can’t give us Bayesian evidence for or against the hypothesis that life is common.
That math is rather obviously wrong. You are so close here—just use Bayes.
We have two mutually exclusive models: life is common, and life is rare. To be more specific, let’s say the life-is-common theory posits that life is a 1-in-10 event, while the life-is-rare theory posits that life is a 1-in-a-billion event.
Let’s say that our priors are P(life is common) = 0.09, and P(life is rare) = 0.91
Now, our observation history over this solar system tells us that life evolved on earth—and probably only complex life on earth, although there may be simple life on mars or some of the watery moons.
As we are just comparing two models, we can compare likelihoods:
P(life is common | life on earth) ∝ P(life on earth | life is common) P(life is common) = 0.1 * 0.09 = 0.009 ~ 10^-2
P(life is rare | life on earth) ∝ P(life on earth | life is rare) P(life is rare) = 10^-9 * 0.91 ~ 10^-9
To convert to actual probabilities we would need to divide by P(life on earth), but that doesn’t really matter because it is a constant normalizing factor.
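For concreteness, here is a minimal sketch of the update above, using the priors and per-model probabilities from the text (0.09/0.91 and 0.1/10^-9), including the normalization step:

```python
# Two-model Bayesian comparison, numbers taken from the text.
prior_common, prior_rare = 0.09, 0.91
like_common, like_rare = 0.1, 1e-9   # P(life on earth | model)

joint_common = like_common * prior_common   # ~ 0.009
joint_rare = like_rare * prior_rare         # ~ 9.1e-10

# Normalizing constant P(life on earth) is just the sum of the joints:
evidence = joint_common + joint_rare
posterior_common = joint_common / evidence

print(posterior_common)  # very close to 1: "life is common" dominates
```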
However, the motives of such a civilization are difficult to predict with any accuracy, so I suspect that the vast majority of possible hypotheses are things we haven’t even thought of yet (unknown unknowns). So, although your specific hypothesis becomes more likely if we are in a simulation, so do all other possible hypotheses predicting large numbers of simulations.
I agree with your general analysis here, although it is important to remember that the full hypothesis space is always infinite. For tractable inference, we focus on a small subset of the most promising theories/models.
When considering the wide space of potential simulators, we must focus on key abstractions. For example, we can focus on models in which advanced civs have convergent instrumental reasons for creating large numbers of simulations. I am currently aware of a couple of wide classes of models that predict lots of sims. Besides aliens simulating other aliens, our descendants could have strong motivations to simulate us—as a form of resurrection, for example, in addition to the common motivator of improving world models. There is also the possibility of creating new artificial universes, in which case there may be strong motivators to create lots of universes, with lots of simulations as a precursor step.
(3) Panspermia / Abiogenesis: it sounds like “Life Before Earth” isn’t a mainstream consensus, based on a couple comments below.
No—that paper is not even really mainstream. I mentioned it as an example of the panspermia model and the resulting potentially expanded timeframe for the history of life. If life is really that old, then it becomes less likely that a single early elder civ colonized the galaxy early and dominated.
P(life is common|life on earth)=P(life is common), because knowing that life did evolve on earth can’t give us Bayesian evidence for or against the hypothesis that life is common.
That math is rather obviously wrong. You are so close here—just use Bayes.
Perhaps I should have used an approximately-equal-to symbol instead of an equals sign, to avoid confusion. And thanks for the detailed writeup. I would agree 100% if you substituted “planet X” for “earth”. Basically, I’m arguing that using ourselves as a data point is a form of the observational selection effect, just like survivorship bias.

As for the math, I’ll pull an example from An Intuitive Explanation of Bayes’ Theorem:
Similarly, let’s suppose that we have a less discriminating test, mammography, that still has a 20% rate of false negatives, as in the original case. However, mammography has an 80% rate of false positives. In other words, a patient without breast cancer has an 80% chance of getting a false positive result on her mammography test. If we suppose the same 1% prior probability that a patient presenting herself for screening has breast cancer, what is the chance that a patient with positive mammography has cancer?
Group 1: 100 patients with breast cancer.
Group 2: 9,900 patients without breast cancer.
After mammography* screening:
Group A: 80 patients with breast cancer and a “positive” mammography*.
Group B: 20 patients with breast cancer and a “negative” mammography*.
Group C: 7920 patients without breast cancer and a “positive” mammography*.
Group D: 1980 patients without breast cancer and a “negative” mammography*.
The result works out to 80 / 8,000, or 0.01. This is exactly the same as the 1% prior probability that a patient has breast cancer! A “positive” result on mammography doesn’t change the probability that a woman has breast cancer at all. You can similarly verify that a “negative” mammography also counts for nothing. And in fact it must be this way, because if mammography has an 80% hit rate for patients with breast cancer, and also an 80% rate of false positives for patients without breast cancer, then mammography is completely uncorrelated with breast cancer.
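The patient-count arithmetic above can be reproduced in a few lines (10,000 screened patients, 1% base rate, 80% hit rate, 80% false-positive rate):

```python
# Reproducing the 10,000-patient table from the mammography example.
patients = 10_000
cancer = int(patients * 0.01)        # 100 patients with breast cancer
no_cancer = patients - cancer        # 9,900 without

true_pos = int(cancer * 0.80)        # 80  (20% false negatives)
false_pos = int(no_cancer * 0.80)    # 7,920 (80% false positives)

p_cancer_given_pos = true_pos / (true_pos + false_pos)
print(p_cancer_given_pos)  # 0.01 -- identical to the 1% prior
```

Because the hit rate equals the false-positive rate, the likelihood ratio is 1 and the test moves the prior nowhere.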
In that example, the reason the posterior probability equals the prior probability is that the “test” isn’t causally linked with the cancer. You have to assume the same sort of thing for cases in which you are personally entangled. For example, if I watched my friend survive 100 rounds of solo Russian Roulette, then Bayes’ theorem would lead me to believe that there was a high probability that the gun was empty or only had 1 bullet. However, if I myself survived 100 rounds, I couldn’t afterward conclude a low probability, because there would be no conceivable way for me to observe anything but 100 wins. I can’t observe anything if I’m dead.
Does what I’m saying make sense? I’m not sure how else to put it. Are you arguing that Bayes’ theorem can still output good data even if you feed it skewed evidence? Or are you arguing that the evidence isn’t actually the result of survivorship bias/observation selection effect?
For example, if I watched my friend survive 100 rounds of solo Russian Roulette, then Bayes’ theorem would lead me to believe that there was a high probability that the gun was empty or only had 1 bullet. However, if I myself survived 100 rounds, I couldn’t afterward conclude a low probability, because there would be no conceivable way for me to observe anything but 100 wins. I can’t observe anything if I’m dead.
Obviously you can’t observe anything if you are dead, but that isn’t interesting. What matters is comparing the various hypotheses that could explain the events.
The case where you yourself survive 100 rounds is somewhat special only in that you presumably remember whether you put bullets in or not and thus already know the answer.
Pretend, however, that you suddenly wake up with total amnesia. There is a gun next to you, and a TV then shows a video of you playing 100 rounds of roulette and surviving—but doesn’t show anything before that (where the gun was either loaded or not).
What is the most likely explanation?
1. the gun was empty in the beginning
2. the gun had 1 bullet in the beginning
With high odds, option 1 is more likely. The survivorship bias/observation selection effect issue you keep bringing up is completely irrelevant when comparing two rival hypotheses that both explain the data!
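The likelihood ratio between the two options is easy to estimate. As a sketch, assume a standard six-chamber revolver with the cylinder re-spun each round (the text doesn’t specify the mechanics):

```python
# Survival likelihood under each hypothesis, assuming a 6-chamber
# revolver re-spun before each of the 100 rounds.
p_empty = 1.0 ** 100            # empty gun: survival is certain
p_one_bullet = (5 / 6) ** 100   # one bullet: ~1.2e-8

print(p_one_bullet)
print(p_empty / p_one_bullet)   # likelihood ratio ~1e8 favoring "empty"
```

Unless the prior for “empty gun” was absurdly tiny, that ratio swamps it.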
Here is another, cleaner and simpler example:
Omega rolls a fair die which has N sides. Omega informs you the roll comes up as a ‘2’. Assume Omega is honest. Assume that dice can be either 10-sided or 100-sided, in roughly equal numbers.
What is the more likely value of N?
100
10
Here is my solution:
priors: P(N=100) = P(N=10) = 0.5
P(N=100 | roll(N) = 2) ∝ P(roll(N) = 2 | N=100) P(N=100) = 0.01 * 0.5 = 0.005
P(N=10 | roll(N) = 2) ∝ P(roll(N) = 2 | N=10) P(N=10) = 0.1 * 0.5 = 0.05
So N=10 is 10 times more likely than N= 100.
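The same calculation, with the normalization step made explicit:

```python
# Die example: equal priors, likelihood of rolling a '2' under each N.
prior = 0.5
post_100 = (1 / 100) * prior   # 0.005 (unnormalized)
post_10 = (1 / 10) * prior     # 0.05  (unnormalized)

print(post_10 / post_100)                 # ratio of 10 in favor of N=10
print(post_10 / (post_10 + post_100))     # normalized: ~0.91
```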