Ben, I think you’re failing to account for under-testing. You’re computing the case fatality rate when you want the infection fatality rate. Most experts, as well as the well-done meta-analyses, place the IFR in the 0.5%-1% range. I’m a little bit confused why you’re relying on this back of the envelope rather than the pretty extensive body of work on this question.
IFR isn’t that helpful when trying to use public case data to estimate a hazard rate. I’ll add a note clarifying that in the post. Since what’s reported are cases, case fatalities are the natural thing to multiply the rate of new cases by.
Some apparently expert-promoted models have been total nonsense, and I prefer a back-of-the-envelope calculation whose flaws are obvious and easy for me to understand, to comparatively opaque sophisticated estimates which I can’t interpret.
Can you point me to a clear concise account that shows how to estimate IFR with available data and use it in a decision-relevant way?
The most detailed treatment I’ve seen on this is this from a couple of months ago.
EDIT: To clarify per discussion below, I do think there’s a fair chance that given a lack of sufficient ventilators the IFR may be >1%.
You say that like detail is a pure good. “Greg Cochran says 1.2%” is better than any number of words from CBG. Anyhow, you repudiated this. When I pushed you on it, you came up with the number 1.4%.
I’m not confident in 1% as an upper limit (especially in an overrun healthcare system) but I do think that comment gives good back-of-the-envelope estimates (as requested). Later on in that thread CBG also acknowledges it may be higher than 1% in some places and conditions.
Detail in this case is useful as it shows multiple sources and back-of-the-envelope calculations. I’m not really assessing CBG (except trusting that he isn’t picking and choosing his arguments), rather I’m assessing his back-of-the-envelope calculation and where likely errors can creep in, which is exactly what the great-grandparent mentioned was preferred.
If “Greg Cochran says 1.2%” is the counter-argument then I don’t really know what to say except how likely is it that he’s wrong this time and by what factor might he be off? What’s his confidence interval? If someone can provide his working then at least that’s something I can assess. It seems he is looking specifically at places with high infection rates and more stretched healthcare systems.
“Anyhow, you repudiated this. When I pushed you on it, you came up with the number 1.4%.”
The naive central estimate of a single back-of-the-envelope estimate where virus prevalence in Lombardy was estimated from one small town from a month previous isn’t something I’d put much weight on. If pushed for an interquartile range based only on this calculation I would say 0.5% < IFR < 3.5%. The point of that calculation wasn’t to get an accurate answer but to show that 0.2% population fatality rate doesn’t imply that the IFR is massive and 3,000,000 US coronavirus deaths this year is still highly unlikely.
Well, don’t do that. I told you this before.
What’s CBG’s confidence interval? When he says 0.5-1%, does he mean something? Does he mean a confidence interval, or a distribution of “normal” situations or a distribution of more general situations? Or does he not mean anything?
“Later on in that thread CBG also acknowledges it may be higher than 1% in some places and conditions.”
It’s nice that he says that, but that’s exactly the situation that you cited him in the other thread, claiming <=1%. I’m guessing that the pseudo-detail is exactly what caused you to not understand his claims. If you don’t know what he claims, how can you assess his work? At least with GC you’re not fooling yourself about what you’ve done.
And I still don’t know what he claims. He seems to claim that NYC had IFR <=1%. Was NYC normal or not? In any event he’s wrong. If NYC defines the upper range, then this affects his conclusion. If NYC doesn’t count, I dunno, but I’m pretty sure that people are equivocating on whether it counts.
I have edited the original comment to more fully reflect my position.
The CFR will shift substantially over time and location as testing changes. I’m not sure how you would reliably use this information. IFR should not change much and tells you how bad it is for you personally to get sick.
I wouldn’t call the model Zvi links expert-promoted. Every expert I talked to thought it had problems, and the people behind it are economists, not epidemiologists or statisticians.
For IFR you can start with seroprevalence data here and then work back from death rates: https://twitter.com/ScottGottliebMD/status/1268191059009581056
Regarding back-of-the-envelope calculations, I think we have different approaches to evidence/data. I started with back-of-the-envelope calculations 3 months ago. But I would have based things on a variety of BOTECs and not a single one. Now I’ve found other sources that are taking the BOTEC and doing smarter stuff on top of it, so I mostly defer to those sources, or to experts with a good track record. This is easier for me because I’ve worked full-time on COVID for the past 3 months; if I weren’t in that position I’d probably combine some of my own BOTECs with opinions of people I trusted. In your case, I predict that Zvi, if you asked him, would also say the IFR was in the range I gave.
I clicked through to the tweet you mentioned, which contains a screencap of a chart purporting to show “An Approximate Percentage of the Population That Has COVID-19 Antibodies.” No dates or other info about how these numbers might have been generated.
Fortunately, Gottlieb’s next tweet in the thread contains another screencap of the URLs of the studies mentioned in the chart. I hand-transcribed the Wuhan study URL, and found that while it was performed at a date that’s probably helpful (April 20th) it’s a study in a single hospital in Wuhan, and the abstract explicitly says it’s not a good population estimate:
Here, we reported the positive rate of COVID‐19 tests based on NAT, chest CT scan and a serological SARS‐CoV‐2 test, from April 3 to 15 in one hospital in Qingshan Destrict, Wuhan. We observed a ~10% SARS‐CoV‐2‐specific IgG positive rate from 1,402 tests. Combination of SARS‐CoV‐2 NAT and a specific serological test might facilitate the detection of COVID‐19 infection, or the asymptomatic SARS‐CoV‐2‐infected subjects. Large‐scale investigation is required to evaluate the herd immunity of the city, for the resuming people and for the re‐opened city.
I’d need to know more about e.g. hospitalization rates in Wuhan to interpret this.
The New York numbers seem to come from a press release, with no clear info about how testing was conducted.
All of these are point estimates, and to get ongoing infection rates, I’d need to fit a time series model with too many degrees of freedom. Not saying no one can do this, but definitely saying it’s not clear to me how I can make use of these numbers without working on the problem full time for a few weeks.
You’ve nonspecifically referred to experts and models a few times; that’s not helpful and only serves to intimidate. What would be helpful would be if you could point to specific models by specific experts that make specific claims which you found helpful.
I’m not trying to intimidate; I’m trying to point out that I think you’re making errors that could be corrected by more research, which I hoped would be helpful. I’ve provided one link (which took me some time to dig up). If you don’t find this useful, that’s fine; you’re not obligated to believe me, and I’m not obligated to turn a LW comment into a lit review.
Given that it apparently took you some time to dig up even as much as a tweet with a screen cap of some numbers that, with quite a lot of additional investigation, might be helpful, I hope you’re now at least less “confused” about why I am “relying on this back of the envelope rather than the pretty extensive body of work on this question.”
If you want to see something better, show something better.
The director of NIAID publicly endorsed that model’s bottom line.
Because of false positives, seroprevalence is massively overestimated everywhere that there hasn’t been a massive outbreak. In those places the IFR is 1-2%. But can we extrapolate to normal outbreaks? If, as widely believed, an overrun medical system has worse mortality, then maybe the normal IFR really is only 0.5-1%. But if your meta-analysis directly measures that, it is not well-done.
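To make the false-positive point concrete, here is a rough sketch of how an imperfect antibody test can inflate measured seroprevalence when true prevalence is low, and how that in turn deflates the implied IFR. It uses the standard Rogan-Gladen correction. Every number below (population, deaths, raw positive rate, sensitivity, specificity) is a made-up illustrative assumption, not data from any of the studies discussed in this thread, and the function names are just mine.

```python
def rogan_gladen(raw_positive_rate, sensitivity, specificity):
    """Estimate true prevalence from the raw positive rate of an imperfect test.

    Rogan-Gladen estimator: (raw + specificity - 1) / (sensitivity + specificity - 1),
    clamped to [0, 1] because sampling noise can push it outside that range.
    """
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("test is no better than chance; correction is undefined")
    corrected = (raw_positive_rate + specificity - 1.0) / youden
    return min(max(corrected, 0.0), 1.0)


def implied_ifr(deaths, population, prevalence):
    """IFR implied by working back from deaths: deaths / (population * prevalence)."""
    if prevalence <= 0:
        raise ValueError("prevalence must be positive to imply an IFR")
    return deaths / (population * prevalence)


# Hypothetical region with no massive outbreak (all values are assumptions).
population = 1_000_000
deaths = 100
raw_rate = 0.03        # 3% of antibody tests come back positive
sensitivity = 0.90     # assumed
specificity = 0.985    # assumed: 1.5% of uninfected people test positive

true_prevalence = rogan_gladen(raw_rate, sensitivity, specificity)
naive_ifr = implied_ifr(deaths, population, raw_rate)
corrected_ifr = implied_ifr(deaths, population, true_prevalence)

print(f"raw seroprevalence:       {raw_rate:.2%}")
print(f"corrected seroprevalence: {true_prevalence:.2%}")
print(f"IFR from raw rate:        {naive_ifr:.2%}")
print(f"IFR after correction:     {corrected_ifr:.2%}")
```

With these assumed inputs, about half of the positives are false, so the implied IFR roughly doubles after correction (from about 0.33% to about 0.59%). Where the outbreak was large, the same 1.5% false-positive rate is a small share of all positives and the correction barely matters. This ignores death-reporting lags and sampling bias, so treat it as a sanity check rather than an estimate.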
The intro paragraph seems to be talking about IFR (“around 2% of people who got COVID-19 would die”) and suggesting that “we have enough data to check”, i.e. that you’re estimating IFR and have good data on it.
Good point, I should add a clarifying note.
Here is a study that a colleague recommends: https://www.medrxiv.org/content/10.1101/2020.05.03.20089854v3. Tweet version: https://mobile.twitter.com/gidmk/status/1270171589170966529?s=21
Their point estimate is 0.64% but with likely heterogeneity across settings.