IFR isn’t that helpful when trying to use public case data to estimate a hazard rate. I’ll add a note clarifying that in the post. Since what’s reported are cases, case fatalities are the natural thing to multiply the rate of new cases by.
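The multiplication described here can be sketched in a few lines; all inputs below are hypothetical illustrative numbers, not figures from the post:

```python
# Sketch of estimating a death-hazard rate from public case data,
# by multiplying the rate of new reported cases by a case fatality rate.
# All inputs are hypothetical illustrative numbers.

def expected_deaths_per_week(new_cases_per_week: float, cfr: float) -> float:
    """Expected deaths implied by one week's new reported cases."""
    return new_cases_per_week * cfr

# Hypothetical: 7,000 new reported cases in a week, CFR of 2%.
print(expected_deaths_per_week(7_000, 0.02))  # -> 140.0
```

The point is that CFR pairs naturally with reported cases, whereas IFR pairs with true infections, which aren't directly observable in public data.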
Some apparently expert-promoted models have been total nonsense, and I prefer a back-of-the-envelope calculation whose flaws are obvious and easy for me to understand, to comparatively opaque sophisticated estimates which I can’t interpret.
Can you point me to a clear concise account that shows how to estimate IFR with available data and use it in a decision-relevant way?
The most detailed treatment I’ve seen on this is this from a couple of months ago.
EDIT: To clarify per discussion below, I do think there’s a fair chance that, given a lack of sufficient ventilators, the IFR may be >1%.
You say that like detail is a pure good. “Greg Cochran says 1.2%” is better than any number of words from CBG. Anyhow, you repudiated this. When I pushed you on it, you came up with the number 1.4%.
I’m not confident in 1% as an upper limit (especially in an overrun healthcare system), but I do think that comment gives good back-of-the-envelope estimates (as requested). Later on in that thread CBG also acknowledges it may be higher than 1% in some places and conditions.
Detail in this case is useful as it shows multiple sources and back-of-the-envelope calculations. I’m not really assessing CBG (except trusting that he isn’t picking and choosing his arguments), rather I’m assessing his back-of-the-envelope calculation and where likely errors can creep in—exactly what the great-grandparent mentioned was preferred.
If “Greg Cochran says 1.2%” is the counter-argument then I don’t really know what to say except how likely is it that he’s wrong this time and by what factor might he be off? What’s his confidence interval? If someone can provide his working then at least that’s something I can assess. It seems he is looking specifically at places with high infection rates and more stretched healthcare systems.
The naive central estimate of a single back-of-the-envelope calculation, where virus prevalence in Lombardy was estimated from one small town a month earlier, isn’t something I’d put much weight on. If pushed for an interquartile range based only on this calculation I would say 0.5% < IFR < 3.5%. The point of that calculation wasn’t to get an accurate answer but to show that a 0.2% population fatality rate doesn’t imply that the IFR is massive, and that 3,000,000 US coronavirus deaths this year is still highly unlikely.
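A minimal version of that kind of back-of-the-envelope calculation, showing how sensitive the implied IFR is to the prevalence assumption (the weakest input when prevalence is extrapolated from one small town). All figures are hypothetical placeholders, not the numbers from the calculation being discussed:

```python
# Back-of-the-envelope IFR: deaths / (estimated prevalence * population).
# Hypothetical numbers throughout; the prevalence assumption dominates.

def botec_ifr(deaths: int, population: int, prevalence: float) -> float:
    return deaths / (population * prevalence)

DEATHS, POPULATION = 20_000, 10_000_000  # 0.2% population fatality rate

for prevalence in (0.06, 0.30, 0.40):
    ifr = botec_ifr(DEATHS, POPULATION, prevalence)
    print(f"prevalence {prevalence:.0%} -> IFR {ifr:.2%}")
# A plausible spread of prevalence assumptions moves the implied IFR
# from about half a percent to a few percent.
```

This is why a 0.2% population fatality rate is compatible with a modest IFR: divide it by any plausible fraction of the population infected and the result stays in the low single digits of percent.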
Well, don’t do that. I told you this before.
What’s CBG’s confidence interval? When he says 0.5-1%, does he mean something? Does he mean a confidence interval, or a distribution of “normal” situations or a distribution of more general situations? Or does he not mean anything?
It’s nice that he says that, but that’s exactly the situation in which you cited him in the other thread, claiming <=1%. I’m guessing that the pseudo-detail is exactly what caused you not to understand his claims. If you don’t know what he claims, how can you assess his work? At least with GC you’re not fooling yourself about what you’ve done.
And I still don’t know what he claims. He seems to claim that NYC had IFR <=1%. Was NYC normal or not? In any event he’s wrong. If NYC defines the upper range, then this affects his conclusion. If NYC doesn’t count, I dunno, but I’m pretty sure that people are equivocating on whether it counts.
I have edited the original comment to more fully reflect my position.
The CFR will shift substantially over time and location as testing changes. I’m not sure how you would reliably use this information. IFR should not change much and tells you how bad it is for you personally to get sick.
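The testing-dependence of the CFR can be made concrete: holding the underlying IFR fixed, the naive CFR (deaths divided by confirmed cases) moves around purely as a function of what fraction of infections get detected. Hypothetical numbers throughout:

```python
# How the naive CFR (deaths / confirmed cases) shifts with testing coverage,
# while the underlying IFR stays fixed. All numbers hypothetical.

def naive_cfr(true_infections: int, deaths: int, detection_rate: float) -> float:
    confirmed = true_infections * detection_rate
    return deaths / confirmed

TRUE_INFECTIONS = 100_000
IFR = 0.01  # assumed fixed infection fatality rate of 1%
deaths = int(TRUE_INFECTIONS * IFR)  # 1,000 deaths

for detection in (0.1, 0.5, 0.9):
    print(f"detect {detection:.0%} of infections -> "
          f"CFR looks like {naive_cfr(TRUE_INFECTIONS, deaths, detection):.1%}")
# Detecting 10% of infections makes the CFR look like 10%;
# detecting 90% makes it look like about 1.1%.
```

The same 1% IFR can therefore present as anything from ~1% to ~10% CFR depending only on test coverage, which is why the CFR is hard to carry across times and places.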
I wouldn’t call the model Zvi links expert-promoted. Every expert I talked to thought it had problems, and the people behind it are economists not epidemiologists or statisticians.
For IFR you can start with seroprevalence data here and then work back from death rates: https://twitter.com/ScottGottliebMD/status/1268191059009581056
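The "work back from death rates" step is a one-line division, though deaths should be taken from a few weeks after the survey since deaths lag infection. The figures and the ~3-week lag below are illustrative assumptions, not numbers from the linked chart:

```python
# Working back an IFR from a seroprevalence point estimate plus cumulative
# deaths. All numbers, and the ~3-week infection-to-death lag, are
# illustrative assumptions.

def ifr_from_sero(lagged_deaths: int, population: int, seroprevalence: float) -> float:
    """IFR = cumulative deaths (lagged ~3 weeks past the survey) / ever-infected."""
    ever_infected = population * seroprevalence
    return lagged_deaths / ever_infected

# Hypothetical region: 8,000,000 people, 20% seropositive in the survey,
# 16,000 cumulative deaths ~3 weeks after the survey midpoint.
print(f"{ifr_from_sero(16_000, 8_000_000, 0.20):.2%}")  # -> 1.00%
```

Skipping the lag biases the IFR downward early in an outbreak, since many of the infections the survey counts haven't yet resolved.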
Regarding back-of-the-envelope calculations, I think we have different approaches to evidence/data. I started with back-of-the-envelope calculations 3 months ago, but I would have based things on a variety of BOTECs and not a single one. Now I’ve found other sources that take the BOTEC and do smarter stuff on top of it, so I mostly defer to those sources, or to experts with a good track record. This is easier for me because I’ve worked full-time on COVID for the past 3 months; if I weren’t in that position I’d probably combine some of my own BOTECs with the opinions of people I trusted. In your case, I predict that Zvi, if you asked him, would also say the IFR was in the range I gave.
I clicked through to the tweet you mentioned, which contains a screencap of a chart purporting to show “An Approximate Percentage of the Population That Has COVID-19 Antibodies.” No dates or other info about how these numbers might have been generated.
Fortunately, Gottlieb’s next tweet in the thread contains another screencap of the URLs of the studies mentioned in the chart. I hand-transcribed the Wuhan study URL, and found that while it was performed at a date that’s probably helpful (April 20th), it’s a study in a single hospital in Wuhan, and the abstract explicitly says it’s not a good population estimate:

Here, we reported the positive rate of COVID‐19 tests based on NAT, chest CT scan and a serological SARS‐CoV‐2 test, from April 3 to 15 in one hospital in Qingshan Destrict, Wuhan. We observed a ~10% SARS‐CoV‐2‐specific IgG positive rate from 1,402 tests. Combination of SARS‐CoV‐2 NAT and a specific serological test might facilitate the detection of COVID‐19 infection, or the asymptomatic SARS‐CoV‐2‐infected subjects. Large‐scale investigation is required to evaluate the herd immunity of the city, for the resuming people and for the re‐opened city.
I’d need to know more about e.g. hospitalization rates in Wuhan to interpret this.
The New York numbers seem to come from a press release, with no clear info about how testing was conducted.
All of these are point estimates, and to get ongoing infection rates, I’d need to fit a time series model with too many degrees of freedom. Not saying no one can do this, but definitely saying it’s not clear to me how I can make use of these numbers without working on the problem full time for a few weeks.
You’ve nonspecifically referred to experts and models a few times; that’s not helpful and only serves to intimidate. What would be helpful would be if you could point to specific models by specific experts that make specific claims which you found helpful.
I’m not trying to intimidate; I’m trying to point out that I think you’re making errors that could be corrected by more research, which I hoped would be helpful. I’ve provided one link (which took me some time to dig up). If you don’t find this useful that’s fine, you’re not obligated to believe me and I’m not obligated to turn a LW comment into a lit review.
Given that it apparently took you some time to dig up even a tweet with a screencap of numbers that would only become helpful after quite a lot of additional investigation, I hope you’re now at least less “confused” about why I am “relying on this back of the envelope rather than the pretty extensive body of work on this question.”
If you want to see something better, show something better.
The director of NIAID publicly endorsed that model’s bottom line.
Because of false positives, seroprevalence is massively overestimated everywhere that there hasn’t been a massive outbreak. In the places that have had one, the IFR is 1-2%. But can we extrapolate to normal outbreaks? If, as widely believed, an overrun medical system has worse mortality, then maybe the normal IFR really is only 0.5-1%. But if your meta-analysis claims to measure that directly, it is not well-done.
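The false-positive problem can be quantified with the standard Rogan-Gladen correction: when true prevalence is low, even a small false-positive rate dominates the apparent seroprevalence. The test characteristics below are hypothetical:

```python
# Rogan-Gladen correction: recover true prevalence from apparent (test-positive)
# prevalence given test sensitivity and specificity. Hypothetical test
# characteristics; inverts apparent = sens*p + (1 - spec)*(1 - p).

def true_prevalence(apparent: float, sensitivity: float, specificity: float) -> float:
    return (apparent + specificity - 1) / (sensitivity + specificity - 1)

# With 98% specificity, the 2% false-positive rate alone produces ~2% apparent
# prevalence even in a fully uninfected population.
p = true_prevalence(0.03, sensitivity=0.95, specificity=0.98)
print(f"apparent 3% -> true {p:.2%}")
# Apparent 3% corresponds to only ~1.1% true prevalence: roughly a 3x
# overestimate. At apparent 20%, the same correction barely moves the number.
```

This is why serosurveys from places with massive outbreaks (high true prevalence) are far more trustworthy than those from lightly hit areas, where the signal can be mostly false positives.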
The intro paragraph seems to be talking about IFR (“around 2% of people who got COVID-19 would die”) and suggesting that “we have enough data to check”, i.e. that you’re estimating IFR and have good data on it.
Good point, I should add a clarifying note.
Here is a study that a colleague recommends: https://www.medrxiv.org/content/10.1101/2020.05.03.20089854v3. Tweet version: https://mobile.twitter.com/gidmk/status/1270171589170966529?s=21
Their point estimate is 0.64% but with likely heterogeneity across settings.