Wuhan is a city of 11 million people; the world population is about 7.9 billion. Saying that the prior on a zoonotic origin is anything less than 11 million / 7.9 billion = 1.4/1000 means that you think people living in Wuhan are less likely to be patient 0 than the average person in the entire world.
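As a quick check of the arithmetic above, here is a minimal sketch. It only uses the two rounded population figures from the comment; the function name `population_share` is made up for illustration.

```python
# Rounded figures quoted in the comment above.
wuhan_population = 11_000_000
world_population = 7_900_000_000


def population_share(local: int, total: int) -> float:
    """Chance that a uniformly random person out of `total` lives in `local`."""
    return local / total


share = population_share(wuhan_population, world_population)
print(f"{share * 1000:.2f} per 1000")  # prints "1.39 per 1000", i.e. roughly 1.4/1000
```

This is just the "average person in the entire world" baseline: it treats every person as equally likely to be patient 0 and says nothing about timing or type of disease.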
The odds given in the OP are based on 3 coincidences:
Location
Timing
What kind of disease it was
Your number is only based on 1 of those coincidences (location). It is not surprising that the probability of one of those things is higher than the probability of all 3 at once.
“What kind of disease” has Bayes Factor 1. It’s exactly the kind of disease that has caused pandemics in the same region of the world within the past 20 years, and which comes from the kind of wild animal trade that has been known to be happening in Wuhan for years. I discussed this in the very next paragraph.
The timing is given by such a weak argument that I did ignore it, yes. WIV has been studying bat coronaviruses for years, and probably will continue to do so for years, and the only thing to tie it so closely in time is a rejected grant proposal that emphasized having the actual work done at UNC.
Actually, I think the timing is evidence against the relevance of the grant proposal. I don’t think they could have done anything like creating Covid (which, based on everything I’ve heard, would have required vastly new and different techniques from what existed in 2018, and which is several thousand mutations away from the nearest known natural virus) in a year and a half.
The timing is given by such a weak argument that I did ignore it, yes. WIV has been studying bat coronaviruses for years, and probably will continue to do so for years, and the only thing to tie it so closely in time is a rejected grant proposal that emphasized having the actual work done at UNC.
This was in a section of the OP marked as an edit, so it’s possible this level of detail wasn’t there the first time you looked:
We must also account for the timing here. Each year in the modern period from, say, 1970 until today has a decently large chance of human-animal transmission, perhaps with some bias towards the present due to more travel. But gain of function is a new invention—it only really started in 2011 and funding was banned in 2014, then the moratorium was lifted in 2017. The 2011-2014 period had little or no coronavirus gain of function work as far as I am aware. So coronavirus gain of function from a lab could only have occurred after say 2010 and was most likely after 2017 when it had the combination of technology and funding. This is a period of about 2 years out of the entire 1920-2020 hundred-year window. Now, we could probably discount that hundred year window down to say an equivalent of 40 years as people have become more mobile and more numerous in China over the past 100 years, on average. But that is still something like a 1 in 20 chance that the worst coronavirus pandemic of the past hundred years happened in the exact 2-year window when gain of function research was happening most aggressively, and that is independent from the location coincidence.
Note this reasoning does not rely on the grant proposal.
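To make the quoted timing estimate concrete, here is a rough sketch of how the "1 in 20" figure depends on the assumed effective window. The 2-year window and the 100- and 40-year denominators come from the quoted passage; the 20- and 10-year values are hypothetical additions, included only to show how sensitive the result is to that choice.

```python
def window_fraction(window_years: float, effective_years: float) -> float:
    """Share of the effective period covered by the window, assuming a
    uniform chance of natural spillover per effective year."""
    if window_years > effective_years:
        raise ValueError("window cannot be longer than the effective period")
    return window_years / effective_years


# 100 and 40 are from the quoted passage; 20 and 10 are hypothetical.
for effective_years in (100, 40, 20, 10):
    fraction = window_fraction(2, effective_years)
    print(f"{effective_years:>3} effective years -> about 1 in {1 / fraction:.0f}")
```

The uniform-per-year assumption is doing all the work here, so the number is only as good as the choice of effective window.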
“What kind of disease” has Bayes Factor 1.
You appear to have more knowledge of virology than I do, but this is far too implausible (on my model) for me to believe it merely because you declared it. I’ve heard of many plagues that were not bat coronaviruses. Your prior on the next naturally-occurring pandemic being a bat coronavirus cannot plausibly be ~100% unless you know some hitherto-unmentioned information that would be very startling to me.
This is a period of about 2 years out of the entire 1920-2020 hundred-year window. Now, we could probably discount that hundred year window down to say an equivalent of 40 years as people have become more mobile and more numerous in China over the past 100 years, on average.
I did see this, but didn’t find it convincing. China has become substantially more urban, more interconnected, more populous, and more connected to the outside world even over the past 10 or 20 years. A claim like this requires substantially more thorough analysis. And, again, is it reasonable to start researching and make COVID in the ~2 year time window given? Like suppose covid started 1 month after this moratorium was lifted, would we just say the probability is 2/1920?
You appear to have more knowledge of virology than I do, but this is far too implausible for me to believe it merely because you declared it. I’ve heard of many plagues that were not bat coronaviruses. Your prior on the next naturally-occurring pandemic being a bat coronavirus cannot plausibly be ~100% unless you know some hitherto-unmentioned information that would be very startling to me.
I think that whatever the next pandemic out of Southern or Central China or Southeast Asia is, the WIV (or some other lab in the region) is extremely likely to have a sample of a related virus and studied it. Sars-Cov-2, as the name might imply, is closely related to the Sars-Cov-1 pandemic of 20 years ago; as far as I know, the original reservoir animal of neither virus has been conclusively identified, although bats are the most obvious candidate.
Scientists have been identifying this region, specifically wet markets, as a likely source of viral pandemics, particularly from bat coronaviruses, for years. This is the exact region of the world they come from. I’m not an expert on virology, but the exact market in Wuhan where the first cases all cluster was identified as a likely place for a pandemic to start in 2014: https://www.nytimes.com/2022/03/23/health/wuhan-pandemic-edward-holmes.html
I’m somewhat surprised that you’re so skeptical of this; I don’t think anyone was ever in doubt that bat coronaviruses spilling into humans in this part of China has been considered a likely problem for a long time.
I did see this, but didn’t find it convincing. China has become substantially more urban, more interconnected, more populous, and more connected to the outside world even over the past 10 or 20 years. A claim like this requires substantially more thorough analysis.
Your first comment seemed to take the position that the OP’s number was not merely different from yours, but indefensible, and you gave a lower bound for a defensible prior that was 1.4x higher than the number you were complaining about.
I feel like you have softened your position to the point where it no longer supports your original comment (from “timing is not even a consideration” to “this timing argument is less thorough than I think it ought to be”). If this is because you changed your mind, great! If not, then I’m confused about how these comments are meant to be squared.
Are you claiming the timing argument is so weak that no reasonable person could possibly estimate its Bayes factor as >1.4? I don’t feel like you’ve come close to justifying a claim like that.
And, again, is it reasonable to start researching and make COVID in the ~2 year time window given?
I have no idea! What’s your 90% CI for how long it would take them, and what evidence are you relying on for that?
I think that whatever the next pandemic out of Southern or Central China or Southeast Asia is, the WIV (or some other lab in the region) is extremely likely to have a sample of a related virus and studied it.
I previously thought you were claiming “the unconditional probability of a naturally-occurring pandemic to be a bat coronavirus is ~1”. This claim differs from that in several ways. Thank you for clarifying!
Making the probability conditional on location of origin: Absolutely fair, we already accounted for the improbability of the location. I missed this.
On the category of the disease we are matching: “bat coronavirus” may be too narrow (though I got that phrase from you), but “have a sample and have studied it” seems too broad. What’s your probability if we change that to “are currently performing gain-of-function research on it”?
(I also notice your claim is phrased such that it presumes any pandemic will be caused by a virus, but I’m assuming that was accidental and your claim generalizes to all vectors.)
I’m somewhat surprised that you’re so skeptical of this; I don’t think anyone was ever in doubt that bat coronaviruses spilling into humans in this part of China has been considered a likely problem for a long time.
“This is likely to happen” and “there’s approximately a 100% chance that the very next problem in this general category will be this” are not the same, and are not close to being the same.
Your first comment seemed to take the position that the OP’s number was not merely different from yours, but indefensible, and you gave a lower bound for a defensible prior that was 1.4x higher than the number you were complaining about.
Are you claiming the timing argument is so weak that no reasonable person could possibly estimate its Bayes factor as >1.4? I don’t feel like you’ve come close to justifying a claim like that.
Roko gave a fairly high-level argument that didn’t dive too much into the details. I don’t believe it is possible for such an argument to reasonably give a probability of “at most 1/1000” that we see what he described with no lab leak. The location and type of disease make a great deal of sense for zoonosis and the timing factor is quite complicated—simply putting a uniform distribution over 80 years is not remotely valid.
It might be possible, with detailed argument and actual data, to come to the conclusion that the level of evidence from these factors gives a Bayes Factor of 1000 or more in favor of lab leak. I don’t think it’s likely, but it is at least complicated enough that I won’t say for sure it’s impossible.
I have no idea! What’s your 90% CI for how long it would take them, and what evidence are you relying on for that?
I don’t know. What I do know is that many relevant experts are skeptical that Covid could have been “created” in a lab, or if possible, think that it would have taken a very long time, and this does not seem to have changed from this old reddit post: https://www.reddit.com/r/science/comments/gk6y95/covid19_did_not_come_from_the_wuhan_institute_of/fqpc7c8/ to the recent Rootclaim debate. It would be theoretically possible that the WIV figured a bunch of things out that no one else knew, and kept them secret, but this of course has to be argued for and put a probability on before you can use it to generate any Bayes Factor at all. It’s the responsibility of the person saying that the timing provides strong evidence to demonstrate that.
I previously thought you were claiming “the unconditional probability of a naturally-occurring pandemic to be a bat coronavirus is ~1”. This claim differs from that in several ways. Thank you for clarifying!
Making the probability conditional on location of origin: Absolutely fair, we already accounted for the improbability of the location. I missed this.
Sorry for not being clearer on this. I can’t remember if I emphasized this above, but I don’t think the pieces of evidence that Roko mentions in this post are independent, so A) actually analyzing them is kind of hard, and B) you can’t just multiply the numbers together.
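A toy illustration of point B, with entirely made-up numbers: when two pieces of evidence are correlated, multiplying their marginal probabilities can badly understate the chance of seeing both. The events `A` and `B` and all three input probabilities are hypothetical and are not estimates of anything in this discussion.

```python
# Hypothetical inputs, chosen only to illustrate dependence.
p_a = 0.1               # P(A)
p_b_given_a = 0.8       # P(B | A): B is very likely once A has happened
p_b_given_not_a = 0.05  # P(B | not A)

# Marginal probability of B, by the law of total probability.
p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a

# Correct joint probability versus the product that assumes independence.
p_joint = p_a * p_b_given_a
p_naive = p_a * p_b

print(f"P(B) = {p_b:.4f}")
print(f"P(A and B) = {p_joint:.4f}")
print(f"P(A) * P(B) = {p_naive:.4f}")
print(f"naive product is too small by a factor of {p_joint / p_naive:.1f}")
```

With these inputs the naive product is off by a factor of about 6; the size of the error depends entirely on how strongly the events are linked.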
What’s your probability if we change that to “are currently performing gain-of-function research on it”?
I have no idea, probably pretty low. But that’s because we don’t actually know that the WIV is performing “gain of function research” or how closely related the viruses it worked on were to Covid. The closest viruses that we know the WIV had samples of are still thousands of mutations away. The only evidence in the post for what research might have been happening is the rejected grant proposal from 2018; it’s not actually clear if WIV did GOF research. See e.g. https://www.factcheck.org/2021/05/the-wuhan-lab-and-the-gain-of-function-disagreement/
(Arguably some genetic features of the virus lean toward lab leak, but this is highly debatable and would require substantial analysis to put any sort of number on; Roko barely mentions them, and I discussed the one thing that he does mention above).
The other thing to keep in mind, which I haven’t brought up yet, is that Wuhan is not the only place one could generate a similar lab-leak hypothesis for. Although it is home to one of only 2 BSL-4 labs in China (as far as we know, at any rate), virology labs are spread across China, and over the course of the last several years, many of them have been asserted to potentially be related to the lab leak as well. Once you start speculating that labs might be doing things that they haven’t made public, then you can consider the possibility that any lab anywhere might be involved. WIV might be one of the stronger coincidences, but it’s certainly not the only one. Doubly so when we have to speculate on what research they might have done and what viruses they might have had, as I mentioned above; if you can assert that the WIV could have been doing things that we don’t know about, well, you could say the same of any lab. And when evaluating evidence this way, you have to not only consider the exact set of facts you got, but all of the situations that you would evaluate similarly (sort of like how a p-value gives the probability of a result at least as extreme as the one you saw under the null). So while something like P(bat coronavirus starts in Wuhan) is at least some level of odd coincidence, a statement like
For the love of Bayes! How many times do you have to rerun history for a naturally occurring virus to randomly appear outside the lab that’s studying it at the exact time they are studying it?
taken literally, the probability might actually be close to one! I haven’t done a thorough review of what every virology lab in China and Southeast Asia is doing, but this is obviously a very large factor that is not considered at all when generating the “1:1000 against” claim.
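The "all of the situations that you would evaluate similarly" point can be sketched as follows. Every number and label here is hypothetical: `hypothetical_shares` stands in for the chance that a natural outbreak starts in each city that hosts some virology lab, and is not based on any survey of actual labs.

```python
def prob_origin_in_some_lab_city(shares: dict[str, float]) -> float:
    """Chance that a natural outbreak starts in any of the listed cities.

    An outbreak has a single origin, so the per-city events are mutually
    exclusive and their probabilities simply add.
    """
    total = sum(shares.values())
    if not 0.0 <= total <= 1.0:
        raise ValueError("shares must sum to a probability between 0 and 1")
    return total


# Hypothetical per-city chances of being the origin of a natural outbreak.
hypothetical_shares = {
    "lab_city_1": 0.020,
    "lab_city_2": 0.015,
    "lab_city_3": 0.010,
    "lab_city_4": 0.005,
}

single = hypothetical_shares["lab_city_1"]
any_city = prob_origin_in_some_lab_city(hypothetical_shares)
print(f"one specific lab city: {single:.3f}")
print(f"any lab city:          {any_city:.3f}")
print(f"ratio:                 {any_city / single:.1f}x")
```

The more cities that would have prompted the same "what a coincidence" reaction, the larger the gap between the probability of the specific coincidence and the probability of some coincidence of that kind.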
I’m trying to get better at noticing when the topic of a conversation has drifted, so that I don’t unwittingly feel pressured to defend a position that is stronger, or broader, or just different, from what I was trying to say.
I was originally trying to say: When you said Roko’s number implied he thought people in Wuhan were less likely than the global average to be patient zero in a pandemic, I think that was an important misrepresentation of Roko’s actual argument.
I notice that we no longer seem to be discussing that, or anything that could plausibly change anyone’s opinion on that. So I’m going to stop here.
(I’m not necessarily claiming this is the first point in this conversation where I could have noticed this. Like I said, trying to get better.)
Ok. I don’t think that Roko necessarily thought of the situation that way; rather, I thought of it as a way to contextualize what a 1:1000 probability of a natural bat coronavirus pandemic starting in Wuhan meant.
You heavily implied that Roko had assigned that probability to that event, and that implication is false.