Interesting that you should prefer ‘they’, referring to the plural data; some versions of the aphorism also use this form—in retrospect, I prefer this form.
Torturing data is a common problem in my field (geophysics). With large but sparse datasets, data can be manipulated to mean almost anything. Normal procedure: first make a reasonable model for the given context; then make a measureable prediction based upon your model; then collect an appropriate dataset by ‘tuning’ your measuring apparatus to the model; then process your data in a standard way. In the case that that your model is not necessarily wrong; then make another measureable prediction based upon your model; collect another dataset by an independent experimental method; then …
Even when following this procedure, models are often later found to be wildly erroneous; in other words, all of the experimental support for your model was dreamt up.
What I was thinking about when typing that was indeed a model by some geophysicists. They had found some kind of correlation between some function of solar activity and some function of seismic activity, but those functions were so unnatural-looking that I couldn’t help thinking they tweaked the crap out of everything before getting a strong-enough result.
You were likely referring to some of the recent work of Vincent Courtillot. A video summarizing some of his work here.
The most interesting aspect of this work, is that Courtillot did not start out with any intention of finding correlations with climate; his field is geomagnetism. Only after noticing certain correlations between geomagnetic cycles and sun spot cycles, did suspected correlations with natural climate cycles become evident.
But if you torture them too long, they will confess falsehoods.
Interesting that you should prefer ‘they’, referring to the plural data; some versions of the aphorism also use this form—in retrospect, I prefer this form.
Torturing data is a common problem in my field (geophysics). With large but sparse datasets, data can be manipulated to mean almost anything. Normal procedure: first make a reasonable model for the given context; then make a measureable prediction based upon your model; then collect an appropriate dataset by ‘tuning’ your measuring apparatus to the model; then process your data in a standard way. In the case that that your model is not necessarily wrong; then make another measureable prediction based upon your model; collect another dataset by an independent experimental method; then …
Even when following this procedure, models are often later found to be wildly erroneous; in other words, all of the experimental support for your model was dreamt up.
What I was thinking about when typing that was indeed a model by some geophysicists. They had found some kind of correlation between some function of solar activity and some function of seismic activity, but those functions were so unnatural-looking that I couldn’t help thinking they tweaked the crap out of everything before getting a strong-enough result.
You were likely referring to some of the recent work of Vincent Courtillot. A video summarizing some of his work here.
The most interesting aspect of this work, is that Courtillot did not start out with any intention of finding correlations with climate; his field is geomagnetism. Only after noticing certain correlations between geomagnetic cycles and sun spot cycles, did suspected correlations with natural climate cycles become evident.