people find it far easier to forgive others for being wrong than being right
Harry Potter and the Half-Blood Prince
First of all, I really appreciate this postmortem. Admitting times when you were wrong couldn’t have been an easy task, particularly if/when you staked a lot of your identity and reputation on being right. As EA and rationalist individuals and institutions become older and more professionalized, I’m guessing that institutional pressures will increasingly push us away from wanting to admit mistakes, so I sincerely hope we get in the habit of publicly recognizing mistakes early on. (Unfinished list of my own mistakes, incidentally[1].) I hope to digest your post further and offer more insightful thoughts, but here are some initial reactions:
Addendum on masks:
Another consideration about masks is that masks turn out in practice to be very reusable, a fact we (or at least I) should have investigated a lot more in early March.
On hospital-based transmission:
I don’t know how much you believed in it, but as presented, this appears to be merely (ha!) a forecasting error rather than a strategic error. In the absence of a clear counterfactual, I don’t think you were obviously wrong here, since it’s quite plausible that if a lot of people like you had ignored or downplayed the role of hospital-based transmission, it would have gotten a lot worse.
On being a jerk re Jim and Elizabeth’s post:
For what it’s worth, I also (privately) asked them to take it down because I had similar considerations to you and thought the thing they wrote about masks was unilateralist-y and a bit of an infohazard. I think I was wrong there. But I think I was mostly object-level wrong about the relative tradeoffs and harms. To the extent I have updated, a) I updated object-level on how much I should cooperate or desire others to cooperate with specific institutions, and b) I updated broadly (but not completely) in favor of openness and against censorship.
I continue to maintain that if I (and possibly you) had the same object-level beliefs as before, it was not incorrect to consider it an infohazard, though of course not an existential one. (But not all object-level infohazards are worth suppressing, particularly if release promotes the relevant meta-level norms more than it harms!)
On superforecasting:
You said you think superforecasting is
materially worse than [you] hoped it would be at noticing rare events early.
I don’t know how high your hopes were, but for what it’s worth, I think this proves too much. I’m not sure about the exact aggregation algorithms that the Open Phil Good Judgment covid-19 project was running, but I feel like all I can realistically gather is that “of this specific set of part-time superforecasters that were on the Open Phil-funded project, more than 50% of them were way too optimistic.”
While it’s certainly some evidence against superforecasters being good at noticing rare events early, I don’t think it’s sufficient evidence against superforecasters being able to do this, and I definitely don’t think this is a lot of evidence against superforecasting as a process.
As you weakly allude to, if you had been on the project and paying closer attention, you would probably have done better. Likewise, I know other superforecasters who I think were much more pessimistic than the GJ median. I suspect superforecasters who regularly read LessWrong and the EA Forum would have done better; and if I were to design a better system for superforecasting on rare events, I’d a) prime people to pay attention to a lot of rare events first, and b) have people train and score on log-loss or some other scoring system that’s more punishing of overconfidence than Brier.
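To make the scoring-rule point concrete, here is a minimal sketch. It assumes the single-outcome binary form of the Brier score, (p - o)^2, and natural-log loss; the function names and the two example forecasts (10% and 0.1% on a rare event that then happens) are hypothetical illustrations, not anything taken from the Good Judgment project.

```python
import math


def brier_score(p: float, outcome: int) -> float:
    """Squared error between the forecast probability and the 0/1 outcome."""
    return (p - outcome) ** 2


def log_loss(p: float, outcome: int) -> float:
    """Negative natural log of the probability assigned to what happened."""
    p_assigned = p if outcome == 1 else 1.0 - p
    return -math.log(p_assigned)


def penalties_if_event_occurs(p_event: float) -> tuple[float, float]:
    """(Brier, log-loss) penalties when an event forecast at p_event occurs."""
    return brier_score(p_event, 1), log_loss(p_event, 1)


# Hypothetical forecasters: one cautious, one overconfident that the
# rare event will not happen. The event then happens.
cautious = penalties_if_event_occurs(0.10)
overconfident = penalties_if_event_occurs(0.001)

print(f"Brier:    cautious={cautious[0]:.3f}  overconfident={overconfident[0]:.3f}")
print(f"Log-loss: cautious={cautious[1]:.3f}  overconfident={overconfident[1]:.3f}")
```

Under these assumptions the Brier penalty barely moves (0.81 versus roughly 1.00, since it is capped at 1), while the log-loss penalty roughly triples (about 2.30 versus about 6.91) and grows without bound as the forecast approaches zero. That is the sense in which log-loss is more punishing of overconfidence.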
(All that said, I think Metaculus did okay but not great on covid-y questions relative to what someone with high hopes for prediction aggregation algorithms might reasonably expect).
On US Gov’t Institutions:
I think a number of your insights here are informed by your policy research experience. For example, you mention how you would have trusted the FDA to do a lot better under Scott Gottlieb. This might be obvious to you, but it’s something I didn’t even really think about until you highlighted this point. You also highlight a lot of useful specific uncertainties about whether the issue was political directors under Trump or nonpolitical directors of specific institutions. I think all of these things are very useful to know from the perspective of a policy researcher like yourself (and for students of US policy), since how to reform institutions is very decision-relevant to you and many other EAs.
That said, at a very coarse level, I think I’m a lot more cynical than you are implying with regard to how well US institutions would have handled this pre-Trump. It’s possible we’re not actually disagreeing, so I’m curious about your counterfactual probabilities on things being an order of magnitude better (<20,000 Americans dead of COVID-19 by now, say) in the following two worlds:
a) Clinton administration continuing all of Obama’s policies?
b) Clinton administration continuing all of Obama’s policies, except that the US CDC presence in China is as understaffed as it is in our timeline?
My reasoning for why I’m generally pretty cynical (at least conditional upon this pandemic spreading at all; maybe a larger international presence could have helped contain it early) in those counterfactual worlds[2]:
1) There’s sort of an existing counterfactual for preparedness of governments with a broadly American/Western culture but as competent at governance as a typical European country. It’s called Europe. And I feel like every large geographically Western country was pretty bad at preparedness? People are praising Germany’s response, but when it comes down to it, Germany has 9000+ confirmed covid-19 deaths in a population of 83 million, or >100 deaths/million, despite taking a large economic hit to suppress the pandemic. Japan had <1000 confirmed deaths in a population of 126.5 million. Now Japan was bad at testing, so maybe Japan actually had ~4000 deaths. But even at those numbers (~31 deaths/million), Japan still had <1/3 as many deaths per capita as Germany. And object-level, Japan seemed to have screwed up a bunch of important things, so there’s a simple transitivity argument where if a high-income country did worse than Japan, its policies/institutions couldn’t have been that great.
Maybe I’m harping on this too much, but I really don’t want us to succumb to the tyranny of low expectations here.
Now some culturally Western countries did fine (Australia, New Zealand). I’m not sure why they did well (maybe it’s because they’re islands, maybe seasonality is bigger than I think so the Southern hemisphere had a huge initial advantage, maybe because they’re around 10-15% East Asian so people had enough ties to China to be worrying earlier, maybe low population density, maybe their institutions are newer and better, maybe just luck), but regardless, I’d counterfactually bet on the response of Hillary’s America looking more like a slightly less competent Europe or maybe Canada and less like Australia/NZ.
2) I didn’t look at it that much, but at a high level, the US response to 2009 H1N1 looked more competent, yet ultimately it didn’t seem sufficient to have achieved containment had the mortality rate been as high as people thought it would be? (Not sure of this, willing to be convinced otherwise on this one).
3) Some inside-view reasoning about specific actors.
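For anyone who wants to check the per-capita comparison in point 1), here is a minimal sketch of the arithmetic. It treats the rough figures quoted above (“9000+”, “<1000”, “~4000”) as exact boundary values, which is an assumption made purely for illustration; the function name is hypothetical.

```python
def deaths_per_million(deaths: float, population_millions: float) -> float:
    """Deaths per million residents, given population in millions."""
    return deaths / population_millions


# Rough figures from point 1), treated as exact for illustration only.
germany = deaths_per_million(9_000, 83.0)
japan_confirmed = deaths_per_million(1_000, 126.5)
japan_adjusted = deaths_per_million(4_000, 126.5)  # the pessimistic guess

# How Japan's pessimistic estimate compares with Germany's confirmed rate.
ratio = japan_adjusted / germany

print(f"Germany:          {germany:.1f} deaths/million")
print(f"Japan, confirmed: {japan_confirmed:.1f} deaths/million")
print(f"Japan, adjusted:  {japan_adjusted:.1f} deaths/million")
print(f"Japan adjusted / Germany: {ratio:.2f}")
```

With these inputs Germany comes out at about 108 deaths/million and Japan’s adjusted figure at about 31.6, a ratio of roughly 0.29, which is consistent with the “<1/3” claim even under the pessimistic assumption about Japanese undercounting.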
___
Anyway, all these gripes aside, thank you again for your thoughtful (and well-written!) post. That couldn’t have been easy to write, and I really appreciate it.
[1] Your post actually got me thinking about how I should be more honest/introspective about my strategic and not just predictive mistakes, so thanks for that! I plan to update the list soon with some strategic mistakes as well. For example, I considered myself to be on the “right” side of masks epistemically but not strategically.
[2] I’m maybe 35% on a) and 30% on b). A lot of that probability mass comes from there being enough chaos/sensitivity to initial conditions that this pandemic maybe wouldn’t have happened at all, rather than from Obama’s or Hillary’s response being an order of magnitude better conditional upon there being an epidemic.
There is a lot here to reply to, and I’m only going to address a few points.
First, on forecasting, I think there is a lot to discuss, but johnswentworth’s comment and my reply are all that I have to say about this for now.
Second, on Government response, I’m also unsure how much we disagree. I definitely think that I have a number of useful insights about institutions, but this is an area where expertise seems to be non-predictive. That means I’m less sure how valuable it is—but I discussed this in more depth here, on Ribbonfarm. That said, I’ll make comments anyways.
I agree that many countries were underprepared, but they also historically relied on American leadership for many of these types of events. America was the acknowledged world leader in biodefense and preparation, has spent more time and money on the problem than elsewhere, and has much more money and expertise than most places—so the failure is much more noteworthy than it otherwise would be.
I also think the EU “failures” should be counted as partial successes, since they mostly have case counts declining and are well prepared to avoid the worst of a possible second wave. That’s a solid half credit in an absolute sense, since they seem poised to have gotten it under control before it ended up everywhere, though they didn’t catch it enough to prevent spread at first, which would have been the goal. The US (and to a lesser extent, the UK) didn’t manage to control things enough to even get past the first wave, and they are poised to fail to herd immunity in most places, a shocking level of failure, especially given how well other countries have managed this.
For counterfactual predictions, on B, if the US did as well as Germany, Japan, France, and other G-7 nations, they would have kept deaths under 20,000, or at least around there. I’d give at least 50% to keeping it below 20k so far. (I’m unsure how bad the Republican Governors would have made this, or what the rest of the world looks like under Clinton. Would the Chinese have cooperated earlier? Counterfactual predictions this far back are basically about writing an alternative timeline—there are WAY too many potential issues to really consider well.) But the epidemic seems under control in the EU, contra the US. So that seems like the relevant counterfactual. (Aside: It seems non-coincidental, though a surprisingly strong effect, that right-wing populist leaders are especially bad at controlling infectious diseases—BoJo, Trump, and Putin all got this very, very wrong. I think the default reaction of trying to control the narrative over dealing with problems is a particularly dangerous approach with infectious diseases.) And for the A counterfactual, it’s similar, but with 20+% probability mass on “this was stopped enough before it left China that there was no pandemic.”
I agree with the following points:
That European countries very much appear to have this under control
That they did much better than the US and Latin America
That right-wing populist leaders did worse than I expected, in a non-coincidental way (Brazil’s Bolsonaro is another example to add to the list).
That “trying to control the narrative over dealing with problems is a particularly dangerous approach with infectious diseases”: very strongly agreed. I’m a big fan of this write-up by NunoSempere, and this historian’s touching reflection on the Spanish flu.
I think it’s likely our disagreements are more about framing than about actual empirical differences. For example, “they seem poised to have gotten it under control before it ended up everywhere, though they didn’t catch it enough to prevent spread at first, which would have been the goal” is a phrase I’d use to describe South Korea and Singapore, not Western Europe, where almost every locale had community transmission. I’d use “they caught it enough to prevent spread” to describe places like Mongolia, with zero or close to zero community transmission, or places where community transmission was contained to a single region.
I agree that Western European governments should get a lot of relative credit for managing to prevent more deaths, disability, and wanton economic destruction, despite being in an initially bad spot. But thousands of people nonetheless died, and those deaths appear to have been largely preventable (in a practical, humanly doable sense). So while I think we should a) emphasize the relative successes (because in these dark times it’s good to both hold on to hope and be grateful for what we have), and b) be unequivocally clear that the other Western governments mostly did better than the US, I don’t want to lose sight of the target, and I also want to be clear that the relative failings of the US under Trump do not excuse the lesser failings of other institutions and governments.