Coronavirus Virology: A Beginner’s Guide
Introduction
This is aimed at those interested in a biological understanding of the basic features of a coronavirus, but who do not have a biological/chemical background to speak of—if you haven’t done biology or chemistry since high school, this should be for you. It starts at square one and it oversimplifies many things. That said, I hope it is at least clear, and allows you to read scientific papers on the features of SARS-CoV-2 (novel coronavirus) without getting too much of a headache.
Viruses mostly just use their host’s (that’s you) metabolism to make more viruses. To understand how that occurs, we will begin with some basic (host) cell physiology—how DNA, RNA, and proteins relate to one another, and what a lipid bilayer is—before moving on to look at the anatomy of a coronavirus, and then finally a handful of specific features that make SARS-CoV-2 so dangerous.
DISCLAIMER: There should be nothing controversial or disputed here, and I have specifically avoided giving information that is contested. This is not least because I am a medical student, not a virologist or epidemiologist. The aim of this guide is just to introduce the basic science, nothing more.
Many thanks to Jaden Kimura for comments and proof-reading.
How does a cell work, anyway?
In your own cells, DNA codes for all of the proteins you produce, which determines how you build cells and, eventually, a body. Think of it as the source code. Short stretches of it are ‘compiled’ (transcribed) onto RNA, which is a very similar but less stable molecule. Only RNA can be actually ‘run’ by translating it into functional proteins.
DNA is double-stranded, but the two strands are not the same—they’re kind of chemical mirror images of one another. One is ‘positive sense’ and the other ‘negative sense’. Because of the way that they mirror one another, they bind tightly together. When new RNA is made, it is ‘copied’, through a similar binding, from the negative sense strand—it looks just like the positive sense strand. This means that your protein-making machinery (ribosomes) is all used to reading positive sense RNA.
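As a rough illustration of the two paragraphs above, here is a toy Python sketch of transcription and translation. It is a simplification under stated assumptions: the 15-letter "gene" is made up, the codon table is a five-entry subset of the real 64-entry genetic code, and the direction in which strands are read is ignored.

```python
# Toy model only: the sequence below is invented for illustration, and
# strand directionality (5' to 3') is deliberately ignored.

DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}         # DNA-to-DNA pairing
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # RNA uses U instead of T

# A small subset of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met",  # also the usual start signal
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": None,   # stop signal
}

def complement_dna(strand):
    """Return the complementary DNA strand, base by base."""
    return "".join(DNA_PAIR[base] for base in strand)

def transcribe(negative_strand):
    """Build RNA by pairing against the negative sense DNA strand."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in negative_strand)

def translate(rna):
    """Read the RNA three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein

positive_strand = "ATGTTTGGCAAATAA"           # hypothetical gene
negative_strand = complement_dna(positive_strand)
rna = transcribe(negative_strand)

print(rna)             # same letters as the positive strand, with U in place of T
print(translate(rna))  # the short chain of amino acids the ribosome would build
```

The point to notice is that the RNA copied from the negative strand reads the same as the positive strand (with U in place of T), which is why ribosomes only ever need to read positive sense RNA.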
The outside ‘wall’ of a cell is made of a lipid bilayer. The bilayer is made up of molecules with a small polar ‘head’ and a long non-polar tail. Because water is very polar (the electrons are clustered near the oxygen nucleus, making the hydrogens slightly positively charged), the polar ‘head’ can form very stable relationships with water—it is ‘hydrophilic’. As the opposite is true of the very non-polar part of the molecule (lipophilic/hydrophobic tail), the molecules arrange themselves to stable states where the tails are touching one another, and the heads are touching the water—a bilayered sphere. Into this bilayer, different proteins can be ‘inserted’ if they have a non-polar section (or sections) and polar regions such that they stably localise inside or through the lipid bilayer.
How does a coronavirus work?
Coronaviruses are enveloped, positive-sense RNA viruses. Hopefully you recognise some of those terms, but we’ll just highlight the consequences of this particular structure.
The genetic material in a coronavirus consists only of positive sense RNA. It can therefore be shoved directly into your ribosomes to produce coronavirus proteins, without ever having to be modified by enzymes. This means that coronavirus RNA on its own is theoretically infectious—if you injected it into human cells, it would use your ribosomes to assemble new virus particles.
In some viruses, the genetic material is basically the entire virus, packaged into some proteins (a nucleocapsid). Coronaviruses, however, also have an external envelope surrounding them. This envelope is made up of host cell membrane—a fatty lipid bilayer, just like your own cells, but with a couple of viral proteins inserted into it.
Because the bilayer relies on the polar/non-polar relations to remain stable, molecules that are similarly structured, with a polar and a non-polar area (amphiphilic molecules), can disrupt the stability of this arrangement at sufficiently high concentrations. They insert into the membrane and pull the lipids apart into much smaller, now more stable, arrangements. Without a lipid bilayer, the viral particle cannot insert itself into cells effectively, and is inactive. Alcohol and soap are precisely such molecules, which is why they are so effective against coronaviruses.
The final feature worth noting about the coronavirus is the spike protein. This is a protein which binds to a host protein (on the cell surface) and causes the envelope to fuse with the cell surface membrane, releasing the nucleocapsid into the cytoplasm (the inside of the cell). Because different species, and different cell lines in a given species, express different cell surface proteins, this leads to the majority of both species specificity and ‘type-of-infection’ specificity—only some viruses are capable of causing respiratory infections, to take a completely random example.
Why is SARS-CoV-2 so dangerous?
There are essentially only two things that make a virus dangerous: how infectious it is, and how serious it is once you have it. The first is roughly described by the R0 (basic reproduction number), and the second by the infection fatality rate.
The first major point is just how bluntly infectious SARS-CoV-2 is. The R0 is a measure of, on average, how many people an infected person will go on to infect (in a population where immunity is negligible). Seasonal flu has an R0 of ~1.5: each person infects, on average, 1-2 further people. MERS’ R0 was just ~0.7, so each case infected, on average, less than one further person, meaning it can’t sustain spread among humans. Current estimates of SARS-CoV-2 R0 vary a lot, but it looks like it’s between 3 and 5. Exponential growth means that the difference between 1.5 and 3 is phenomenal: by the fifth ‘generation’ alone, a single case of a virus with an R0 of 3 will give rise to ~243 new infections (3^5); a virus with R0=1.5 only ~8 (1.5^5).
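The generation arithmetic can be checked with a few lines of Python. This is a minimal sketch, assuming a single starting case, a constant R0, no immunity and no interventions; the function names are my own and purely illustrative.

```python
def new_infections_in_generation(r0, generation):
    """New infections in a given generation, starting from one case."""
    return r0 ** generation

def cumulative_infections(r0, generations):
    """Total infections across generations 1..n, excluding the first case."""
    return sum(r0 ** g for g in range(1, generations + 1))

for r0 in (1.5, 3):
    fifth = new_infections_in_generation(r0, 5)
    total = cumulative_infections(r0, 5)
    print(f"R0={r0}: generation 5 = {fifth:.1f}, cumulative = {total:.1f}")
```

With R0 = 3 the fifth generation alone is 243 people, and 363 people have been infected in total since the first case; with R0 = 1.5 the figures are about 7.6 and about 19.8.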
However, this clearly isn’t the whole story: SARS had a similar R0, and only infected around 8,000 people worldwide. Why? SARS cannot be transmitted from people who are asymptomatic. This means that isolation of people with symptoms is highly effective; following containment measures, the effective reproduction number is thought to have dropped to around 0.4. However, as has been discussed already on LW, this doesn’t seem to be true of SARS-CoV-2. Firstly, it seems that people are infectious for as much as 48 hours in advance of any symptoms. Secondly, it’s likely (though, for obvious reasons, hard to prove) that a significant proportion of COVID-19 cases are either totally or mostly asymptomatic, particularly in children. Combined with a relatively long incubation period (averaging 5 days but likely up to 13 or 14 in the young), this means that by the time cases are presenting to hospitals in large numbers, the disease is already running rampant in the general population.
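To see why pushing the reproduction number below 1 matters so much, here is a minimal Python sketch. It assumes the simplest possible model: each case infects, on average, the same number of further people, so the expected chain from one case is a geometric series that sums to 1 / (1 - R) when R is below 1. The function name and the 200-generation cut-off are my own choices, not anything from the epidemiology literature.

```python
def expected_outbreak_size(r, generations=200):
    """Expected total cases (including the first) seeded by one case.

    Only meaningful for r < 1, where each generation is smaller than the last.
    """
    if r >= 1:
        raise ValueError("with r >= 1 the expected chain does not die out")
    return sum(r ** g for g in range(generations + 1))

for r in (0.4, 0.7):
    print(f"R={r}: about {expected_outbreak_size(r):.2f} cases per introduction")
```

At a reproduction number of 0.4, each introduced case leads to fewer than two cases in total on average, and at 0.7 to a little over three, so chains of transmission fizzle out rather than grow.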
The other half of the equation, severity of disease, is fairly self-evident, not to mention well-discussed. Note that the case fatality rate (CFR) measures the percentage of those formally diagnosed with the disease who die. As it seems a significant number of those infected are asymptomatic (and therefore never diagnosed), the more important number is the infection fatality rate (IFR), which counts all those infected. Unfortunately this is almost impossible to assess accurately except in specific circumstances (like where you have enough tests to do regular whole-cohort testing, to catch the asymptomatic cases), so CFR is used more commonly.
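The difference between the two rates is just a difference in denominator, which a short Python sketch makes concrete. The numbers below are entirely made up for illustration (they are not estimates for COVID-19), and the function names are my own.

```python
def case_fatality_rate(deaths, diagnosed_cases):
    """Deaths as a fraction of those formally diagnosed."""
    return deaths / diagnosed_cases

def infection_fatality_rate(deaths, total_infections):
    """Deaths as a fraction of everyone infected, diagnosed or not."""
    return deaths / total_infections

# Hypothetical outbreak: 1,000 people infected, of whom only 400 are ever
# diagnosed (the rest have mild or no symptoms), and 10 die.
deaths = 10
diagnosed = 400
infected = 1000

cfr = case_fatality_rate(deaths, diagnosed)
ifr = infection_fatality_rate(deaths, infected)
print(f"CFR = {cfr:.1%}, IFR = {ifr:.1%}")
```

In this invented example the CFR is 2.5% but the IFR is only 1%: the more infections go undiagnosed, the more the CFR overstates the risk to an average infected person.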
Current estimates of the CFR are around 2-3%. This is much lower than, for instance, SARS (11%), as well as a number of other diseases in humans (TB 43%, tetanus 50%, Ebola 83%). The real problem is, of course, that these diseases are much less infectious. Not only do they affect fewer people, but they affect fewer people at one time. With SARS-CoV-2, by contrast, much higher numbers of people will require critical care beds and ventilators at once, and these will not be available to everyone through the peak of the epidemic. This seems set to push the deaths into the millions over the coming months, particularly in poorer countries.
Further Resources/Bibliography
A talk with more in-depth discussion of previous epidemics and SARS-CoV-2 (excellent not least because it’s free)
Mims’ Medical Microbiology (Chapter one in particular is likely to be useful if you can find a PDF)
Principles of Virology (an excellent textbook if you’re looking to go further; useful throughout for this topic)
Biocalculus—Stewart and Day (probably the least useful listed here)
This seems wrong. The metric of how likely you are to die when you are infected is the infection fatality rate. The case fatality rate is how likely you are to die once you are diagnosed with the virus and become a medical case.
This is true, and a mistake on my part (they don’t bother with IFR in medical school, likely because it’s not as relevant for day-to-day medicine as CFR). I’ll update the post to try to explain the difference. Thanks a lot.
Thanks and that was very helpful and a good level for me.
I do have two questions, one just more a curiosity than really important.
I had been thinking that the spike binding with the ACE2 protein on the cell membrane was actually the entry path, perhaps based on a misconception. My (limited) understanding is that the ACE2 protein actually forms a tunnel through the cell’s lipid bilayer, so it serves as a mechanism both for letting compounds made by the cell’s machinery exit the cell for external functions and for letting in something the cell needs.
However, your description seems to describe a case where the two lipid bilayers really merge. Is that what is really happening: the virus is not using the spike to actually puncture the human cell, it only hooks on to the ACE2 protein, and then the two walls (the human cell’s and the virus wrapper’s) just merge into a common wall? Maybe a bit like what happens internally when an organelle buds off from the smooth ER and then links up with the Golgi apparatus?
That was the curiosity question.
I’m still not clear how the infection is really working here. The virus binds with the cell, so the nucleocapsid is now inside the cell but still in its protein wrap, so the RNA is not really exposed. How is the RNA exposed?
To your first point: my intuition is that ACE2 is far too small for the genome to pass through it. ACE2 is an enzyme that’s bound to the membrane. Its name stands for ‘angiotensin-converting enzyme 2’, and its main job is to cleave angiotensin II into the shorter peptide angiotensin-(1-7) (it is the related enzyme ACE that converts angiotensin I to angiotensin II). It does pass through the membrane, but it’s not really a ‘channel’: it is simply localised to the cell membrane, and acts on substances extracellularly.
Enveloped viruses can enter cells in many ways (Principles of Virology chapter 5 is really excellent for this, if you’re interested). It seems that SARS-CoV (the original outbreak) enters cells primarily how it is implied above: simple, direct membrane fusion following binding to the ACE2 receptor. There is some speculation that it may under some circumstances be endocytosed (taken into the cell in a separate sphere of membrane) and then break free of the endosome (the bubble) in a pH-dependent way. Obviously this is further complicated by the fact that this is SARS-CoV-2 that we’re really interested in, so I thought it would be best to leave it blank. You’re right in thinking this process is similar to SER budding, though.
To your second point: I wasn’t actually sure! I’ve done some research, but honestly I’m still not as confident about this as about the rest. As far as I can tell, for most viruses nucleocapsid shedding is either mediated by substances or organelles inside the cytoplasm—ribosomes in particular, apparently, bind to the capsids of some viruses and destabilise them—or is part of the process of receptor binding. Some viruses, for instance, seem to be able to leave their nucleocapsid behind with their envelope so it coats one side of the cell membrane.
Sorry I can’t give a better answer, hope it helps!
Brook’s response is pretty good. I can provide a little more detail.
The spike protein of the virus mediates both binding to the ACE2 protein (which allows it to attach to the cell in the first place) and the fusion of the membranes. ACE2 is not involved in the fusion event; that is completely mediated by the spike. All ACE2 does is allow binding that brings the two membranes close together for a long time. The spike has two functional domains: one that is highly variable across coronaviruses, which mediates attachment and is the reason different viruses attack different species and cell types, and one that is more highly conserved, which triggers the membrane fusion. In order for the fusion to occur, the spike protein has to be processed by a protease that actually cuts the fusion domain apart from the binding domain. This does not make them fall off each other; they remain bound, but they no longer have a continuous backbone. This then allows a re-folding of the protein to a lower-energy state, which drives the fusion of the closely apposed membranes.
It appears from the literature I have found that the re-folding requires an acidic pH, suggesting that fusion probably requires endocytosis of the virus into the lysosome as it goes along for the ride with recycled ACE2 protein. (This is one of several reasons that chloroquine and hydroxychloroquine are being studied for efficacy: they are known to reduce the acidification of this cellular compartment.) Researchers are still arguing about whether the current virus has the spike protein cleaved during synthesis, or cleaved by proteases that are present in the lysosome where the ACE2 is recycled. One of the distinguishing characteristics of this virus compared to other coronaviruses is extra cuttable sequences between the two domains, allowing more proteases to more easily cut the two domains apart, causing faster and more reliable viral entry. This has been noted in virulent strains of multiple other viruses in the past.
What about virulence? Does it take far fewer viral particles to cause a COVID-19 infection compared to the flu?
That’s gonna take a lot of experiments on monkeys to get the answer to...