The median academic paper is a good hypothesis away from actually making incremental scientific progress. The top 10% of papers written in this manner (let’s call them “paper milled”) inspire exploration and follow-up work by other scientists. They can become highly cited. They can save other people a LOT of time.
Let me give an example of something I’m working on. My team was looking at Clonal Hematopoiesis of Indeterminate Potential (CHIP): non-disease-causing mutations in cancer-driving genes that appear in white blood cells as people age. These mutations were interfering with our cancer panels. We were trying to disentangle the effects of selection pressures and mutation hotspots on the appearance of CHIP. How much was due to higher replication rates, and how much was due to a higher likelihood of mutation at specific locations? We were going to do a time-consuming analysis of our own data before I found several twin studies, which found no significant difference between identical and same-sex fraternal twins in terms of CHIP correlation. Turns out, mutation hotspots didn’t play a major role (there are a lot of caveats here, obviously; I’m oversimplifying).
These papers, which were simple analyses done on existing data sets, probably saved us a hundred man-hours. They may have been formulaic analyses of other people’s data, but they are still fundamentally good and useful papers that move science forward. Soon, I’ll be using AI to sort through the enormous volume of papers to find useful ones like these. I’ll also be coming up with better hypotheses with the aid of AI.
AI is going to matter. It’s already mattering. This is important for the average scientist.