This seems like a useful and accurate overview of the general state of data utilization in many organizations.
In my work as a software engineer at a clinical research company, I frequently get to watch my coworkers struggle to convince our clients (companies running clinical trials) that yes, it is critical to lock all of the available data entry options to industry-standardized terms FROM THE BEGINNING, or else they will be adding thousands of hours of data cleaning on the tail end of the study.
An example of an obstacle to this: clinicians running or designing the trials are sometimes adamant that the field for “Reason for treatment discontinuation” include an option called “Investigator Decision”, even though that is not an available term in the standard list and the correct standardized code item is “Physician Decision”. They are convinced that the difference matters, even though the people doing the data cleaning on the back end are required to match every entry to the acceptable coded terms. It’ll get mapped to “Physician Decision” either way, because the FDA only accepts applications that adhere to the standards.
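To make the trade-off concrete, here is a rough Python sketch of the two approaches: rejecting non-standard terms at entry time versus mapping them back during cleaning. Everything in it is my own illustration rather than any real system. The function names are made up, and apart from “Physician Decision” and “Investigator Decision” the terms are placeholders, not a copy of the actual standard list.

```python
# Hypothetical controlled-terminology list. Only "Physician Decision" comes
# from the example above; the other entries are illustrative placeholders.
STANDARD_TERMS = {"Physician Decision", "Adverse Event", "Withdrawal by Subject"}

# Hypothetical synonym map of the kind data cleaners end up maintaining
# when free-form or custom options are allowed at entry time.
SYNONYMS = {"investigator decision": "Physician Decision"}


def validate_at_entry(value: str) -> str:
    """Entry-time check: accept only terms already in the standard list."""
    if value not in STANDARD_TERMS:
        raise ValueError(f"{value!r} is not a standard term")
    return value


def clean_after_the_fact(value: str) -> str:
    """Back-end cleaning: map a non-standard entry to its standard term."""
    if value in STANDARD_TERMS:
        return value
    key = value.strip().lower()
    if key not in SYNONYMS:
        # In practice this is where a human has to review the record.
        raise ValueError(f"no standard mapping known for {value!r}")
    return SYNONYMS[key]
```

With the entry-time check, “Investigator Decision” is simply never stored. Without it, every such record has to pass through something like `clean_after_the_fact`, and every entry the synonym map doesn’t cover turns into manual review.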
In my opinion, a common cause of this disconnect is that those running trials are usually quite ignorant of what the process of data cleaning and analysis looks like, and they have never been recipients of their own data.
As a pipe dream I would be in favor of mandatory data science courses for all medical professionals before letting them participate in any sort of research, but realistically that would only add regulatory burden while accomplishing little good as there’s no practical way to guarantee they actually retain or make use of that knowledge.
literally, he did not believe in probabilities between zero and one. yes, such people exist. he would say things like “either it is, or it isn’t” and didn’t buy it when we tried to explain that a 90% chance and a 10% chance are both uncertain but you should treat them differently.
...How does someone this idiotic ever stay in a position of authority? I would get their statements on statistics and probability in writing and show it to the nearest person-with-ability-to-fire-them-who-is-not-also-a-moron.
A couple minor edit suggestions:
Footnote [1] seems to be missing the word “opportunity” after “every” in:
To put it bluntly, this statement feels like several related sentences were put into a hydraulic press and this was the result. Perhaps rephrase it as multiple component sentences? Fewer words do not necessarily make something easier to parse.