The recent article by Steven Quay & Richard Muller in the Wall Street Journal attempts to bring the issue to a head by simplifying it down to two main points:
(1) The double CGG codons in the SARS2 furin cleavage site were deliberately designed by the 11 or 12 researchers who have created chimeric viruses as an unmistakable ‘marker’ for lab-made viruses so that you could always tell which future mutations evolved from a lab virus and which were naturally evolved. SARS2 has these tell-tale double CGG codons in its furin cleavage site, ergo it’s lab-made.
(2) Natural evolution, of the type displayed by SARS1 & MERS, involves a long series of “run-up” mutations (tries and fails) both in the bat & in the intermediary animals (palm civets, dromedary camels). They also had a similar series of immediate “follow-on” mutations in a race for “optimization” of infectivity once the virus broke out in humans. No evidence has been found for SARS2 displaying either the run-up or the follow-on, ergo it’s unlikely to be naturally evolved.
I would welcome hearing of competent commentary that directly refutes these two arguments. Maybe an actual gene-splicing researcher might say “Nah, we did it that way because it was easier, or because it was cheaper. We didn’t do it to create a marker.” Or maybe “We kept using the same codons as previous researchers had used merely in order to eliminate one factor of variability and make it easier to analyze our results.” Something like that.
Or, “The way SARS1 and MERS developed is only one of the possible ways for viruses to evolve. There are many other ways. That SARS2 didn’t have a run-up and follow-on of mutations is in no way indicative.”
If anyone finds articles addressing these arguments head-on, I would appreciate hearing about it.
Do you have a cite for previous work reporting or using this sequence (something like cct cgg cgg gca) for a cleavage site in viruses? I only ended up finding and looking through one bit of prior gain of function research that’s the sort of genetic engineering you’re hypothesizing ( https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168280/ ) but it used a totally different sequence. Better yet, someone from pre-covid-19 times talking about how they made their code include “cggcgg” as a marker.
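For anyone who wants to check what that stretch of nucleotides spells, here is a minimal sketch in Python. The table holds only the standard-genetic-code entries for proline, arginine and alanine, and the function name `translate` is my own, not from any library.

```python
# Standard genetic code, restricted to the three amino acids involved here.
# Written in the DNA alphabet (T instead of U) to match the quoted sequence.
CODON_TO_AA = {
    # Proline
    "CCT": "P", "CCC": "P", "CCA": "P", "CCG": "P",
    # Arginine (six synonymous codons)
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
    # Alanine
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) codon by codon."""
    seq = dna.replace(" ", "").upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


print(translate("cct cgg cgg gca"))  # PRRA
```

So the sequence in question encodes P-R-R-A, with both arginines spelled CGG.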
Richard Muller, co-author with Steven Quay of the WSJ article, states in his interview with Sky News Australia (Scientific report suggests Wuhan lab leak as origin of COVID-19, YouTube, 10 June 2021, at 5:40 mark) that CGG was the spelling of arginine “most used in the laboratory” in lab-inserted furin cleavage sites and was in fact used by Shi Zhengli at the WIV, as she reported in one of her published papers.
But Steven Quay’s mammoth 193-page Bayesian Analysis of SARS-Cov-2 Origin (https://zenodo.org/record/4477081#.YMU0-S0ZNE4) puts the figure at only half of lab experiments and suggests other reasons for the choice besides tracking, which seems to be mentioned as merely another “additional advantage”. See the section entitled “Evidence. Laboratory codon optimization uses CGG for laboratory insertions of arginine residues 50% of the time.” (p. 90)
The interpretation of “marker” as a deliberate research strategy for distinguishing lab-made from naturally occurring viruses is my own, and may be overstating the explicit intentions of researchers. It is derived from Steven Quay’s WSJ article, specifically this passage:
“Although the double CGG is suppressed naturally, the opposite is true in laboratory work. The insertion sequence of choice is the double CGG. That’s because it is readily available and convenient, and scientists have a great deal of experience inserting it. An additional advantage of the double CGG sequence compared with the other 35 possible choices: It creates a useful beacon that permits the scientists to track the insertion in the laboratory.”
Still, deliberate or incidental, the presence of a double CGG in the furin cleavage site of SARS2 weighs heavily on the lab-origin side of the probability argument, since its 50% use in lab insertions contrasts strongly with the fact that it has (so far) been found nowhere in the genome of any other virus in the sarbecovirus subgenus of betacoronaviruses that SARS1 and SARS2 belong to (MERS is a betacoronavirus too, but in a different subgenus), none of which, apart from SARS2, even has a furin cleavage site.
Whatever the intentions of researchers, is there another interpretation of the empirical data that would alter Steven Quay’s “beyond a reasonable doubt” conclusion that the virus came from a lab?
What other factors could be at play here to qualify further the results of his Bayesian analysis?
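To make that question concrete, here is a rough sketch of the odds form of Bayes’ rule applied to this one piece of evidence. Only the 50% lab figure comes from Quay’s report as quoted above. The function name, the 50/50 prior and the candidate values for the natural probability are placeholders of my own, chosen only to show how sensitive the answer is to that one input.

```python
def posterior_lab_probability(prior_lab: float,
                              p_evidence_given_lab: float,
                              p_evidence_given_natural: float) -> float:
    """Posterior P(lab | evidence) via posterior odds = prior odds * likelihood ratio."""
    if not 0.0 < prior_lab < 1.0:
        raise ValueError("prior_lab must be strictly between 0 and 1")
    if p_evidence_given_natural <= 0.0:
        # A literal 0% would make the likelihood ratio infinite.
        raise ValueError("p_evidence_given_natural must be positive")
    prior_odds = prior_lab / (1.0 - prior_lab)
    likelihood_ratio = p_evidence_given_lab / p_evidence_given_natural
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: a 50/50 prior, the 50% lab figure quoted from the report,
# and a range of made-up values for how often nature might produce CGG-CGG.
for p_natural in (0.5, 0.1, 0.01, 0.001):
    p = posterior_lab_probability(0.5, 0.5, p_natural)
    print(f"P(CGG-CGG | natural) = {p_natural}: P(lab | CGG-CGG) = {p:.3f}")
```

The conclusion moves a lot depending on what you plug in for the natural case, so any argument that the natural probability is small but not zero (recombination, for example) is exactly the kind of factor that would qualify the result.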
(1) The double CGG codons in the SARS2 furin cleavage site were deliberately designed by the 11 or 12 researchers who have created chimeric viruses as an unmistakable ‘marker’ for lab-made viruses so that you could always tell which future mutations evolved from a lab virus and which were naturally evolved. SARS2 has these tell-tale double CGG codons in its furin cleavage site, ergo it’s lab-made.
If you wanted strong tracking why would you only do it once and not a few times so it’s more stable?
It’s because there is only one single place in the genome that you really want to track: the furin cleavage site (FCS). I assumed the wrong reason for using double CGG.
It’s not to distinguish natural from lab-made viruses (although it does do that).
It’s so that you can have a test in the lab for whether the FCS you have inserted is working or not. It’s so that you can “check your work”.
The unique spelling with double CGG is the only one out of the 36 possible codon pairs for the two arginines (the “RR” in the “PRRA” FCS insertion) that allows you to track whether the cleavage you are trying to engineer has happened.
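The “36” is just 6 × 6: arginine has six synonymous codons in the standard genetic code, and there are two arginines in a row. A quick sketch to enumerate them (the variable names are mine):

```python
from itertools import product

# The six arginine codons in the standard genetic code (DNA alphabet).
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]

# Every way to spell two consecutive arginines (the "RR" in "PRRA").
rr_spellings = ["".join(pair) for pair in product(ARG_CODONS, repeat=2)]

print(len(rr_spellings))         # 36
print("CGGCGG" in rr_spellings)  # True
```

This only confirms the count. It says nothing about which of the 36 spellings labs or nature actually prefer.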
Steven Quay explains this at the 59:00 mark of his interview with Julius Killerby, which is well worth listening to in its entirety, as it explains the odds of a lab leak vs. natural evolution, based on undisputed facts.
I’m not remotely qualified to comment on this, but fwiw in the Mojiang Mine Theory (which says it was a lab leak, but did not involve GOF), six miners caught the virus from bats (and/or each other), and then the virus spent four months replicating within the body of one of these poor guys as he lay sick in a hospital (and then of course samples were sent to WIV and put in storage).
This would explain (2) because four months in this guy’s body (especially lungs) allows tons of opportunity for the virus to evolve and mutate and recombine in order to adapt to the human body, and maybe it also explains (1) either randomly or via recombination between viral and human DNA (if that makes sense?), again during those four months in this poor guy’s body.
It seems like an interesting hypothesis but I don’t think it’s particularly likely. I’ve never heard of other viruses becoming well adapted to humans within a single host. Though, I do think that’s the explanation for how several variants evolved (since some of them emerged with a bunch of functional mutations rather than just one or two). I’d be interested to see more research into the evolution of viruses within human hosts, and what degree of change is possible & how this relates to spillover events.
Re 1) the codons, according to Christian Drosten, have precedent for evolving naturally in viruses. That could be because viruses evolve much faster than e.g. animals. Source: search for ‘codon’ and use translate here: https://www.ndr.de/nachrichten/info/92-Coronavirus-Update-Woher-stammt-das-Virus,podcastcoronavirus322.html
The link also has a bunch of content about the evolution of furin cleavage sites, from a leading expert.