Hooooo boy.
Here is how I have been evaluating data; I’m curious whether other people are making judgments based on similar inputs:
Primary source material (CDC data tracker) is better than secondary source interpretation (CNN COVID newsfeed).
Small-scale primary source material, such as state or county data, is better than large-scale aggregate primary source material.
Secondary source interpretation can be more, not less, valuable when created by a single person rather than a news site, even if that individual does not have a medical background. An individual is more likely to look for information that helps them decide whether or not to take a specific action, and to share the detail that explains how they reached that conclusion.
Assume something is lost/miscalculated/false-priored with every aggregation.
Assume all primary source material and all interpretations thereof are compromised for some reason (biases, incentives, etc.). Ignore actual numbers. Watch for trends.
I’ve also been doing a fair amount of on-the-ground evaluation, e.g.:
Do I know people who currently have COVID?
Are people around me (strangers in grocery stores, etc.) visibly ill?
Do I know people who have had post-vaccine health issues?
How does my experience correlate with the experience the data says I should be having?
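One way to see why “ignore actual numbers, watch for trends” can be reasonable: if a source undercounts by a roughly constant fraction, the absolute values are wrong but the week-over-week ratios are unchanged. The sketch below is only an illustration of that arithmetic. The case counts, the 40% reporting rate, the 5% tolerance, and the function names are all made up for the example, and a real reporting rate is unlikely to stay constant over time.

```python
def week_over_week_ratios(counts):
    """Ratio of each value to the one before it (assumes all counts are positive)."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


def trend_direction(counts, tolerance=0.05):
    """Label a series 'rising', 'falling', or 'flat' from its average ratio.

    The tolerance is an arbitrary, hypothetical threshold for this sketch.
    """
    ratios = week_over_week_ratios(counts)
    average = sum(ratios) / len(ratios)
    if average > 1 + tolerance:
        return "rising"
    if average < 1 - tolerance:
        return "falling"
    return "flat"


# Hypothetical "true" weekly case counts (invented for illustration).
true_cases = [100, 120, 150, 180]

# Hypothetical source that only captures 40% of cases every week.
reported_cases = [count * 0.4 for count in true_cases]

print(trend_direction(true_cases))
print(trend_direction(reported_cases))
```

Both series get the same label even though every reported number is off by more than half. The trend only stops being trustworthy when the bias itself changes over time, for example if testing availability shifts.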
One of the points of the OP seems to be that aggregations like the CDC data tracker are not themselves primary source material. Like, the chain goes “person provides sample” → “sample gets processed” → “result gets recorded locally” → “result gets aggregated nationally”, and each of those steps feels like it has some possibility for error or bias or whatever. That CNN is even further from the ground seems useful to know, but it doesn’t tell us how connected the CDC is.
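As a rough sketch of how that chain compounds: if each step independently has some small chance of distorting a result, the chance that a result survives every step untouched is the product of the per-step survival chances. The per-step error rates below are invented placeholders, not estimates of anything real, and the independence assumption is itself a simplification.

```python
from math import prod


def chance_all_steps_clean(step_error_rates):
    """Probability a result passes every step undistorted.

    Assumes each step fails independently of the others, which is
    a simplifying assumption for this sketch.
    """
    return prod(1 - rate for rate in step_error_rates)


# Hypothetical per-step error rates (placeholders, not real estimates).
chain = {
    "person provides sample": 0.02,
    "sample gets processed": 0.03,
    "result gets recorded locally": 0.01,
    "result gets aggregated nationally": 0.02,
}

rates = list(chain.values())

# Local data stops before the national aggregation step.
local_clean = chance_all_steps_clean(rates[:3])
national_clean = chance_all_steps_clean(rates)

print(f"clean through local recording: {local_clean:.3f}")
print(f"clean through national aggregation: {national_clean:.3f}")
```

Under these assumptions, every added step can only lower the chance of a clean result or leave it unchanged, which is the intuition behind preferring data that sits closer to the ground. It says nothing about how large the real per-step rates are.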
Agreed (which is why I noted that county data could be more valuable than aggregated CDC data, and that nuance has the potential to be lost with every aggregation), and I spent a good 30 minutes after writing this comment asking myself if there is a better term than “primary source,” which I probably used incorrectly above.
That said, it’s fair to note that I didn’t actually answer the question asked, because I don’t know how to determine the reliability of any given number (or any given source providing any given number). How are other people doing this?