Distribution of knowledge and standardization in science
Modern technology facilitates the distribution of knowledge immensely. Through Google and Wikipedia, billions have access to an enormous amount of knowledge that was previously hard to get hold of. Still, a lot of knowledge is not getting properly distributed, which reduces the rate of progress. What I’m thinking of is scientific knowledge that would be useful to other scientists but fails to reach them. For instance, when I read history it strikes me that historians could make enormous use of the advances of cognitive psychology in their explanations of various actions. (The same presumably holds for all social sciences. The one discipline that seems to have made lots of use of these findings is economics.) Likewise, statisticians, mathematicians and philosophers of science could, if given the time, point out methodological problems in many published articles. Even within disciplines, lots of useful knowledge fails to be distributed adequately. It happens quite often that someone knows that an article is badly flawed but does not spread this knowledge, because there is no incentive to do so.
I think that one major problem is that scientific knowledge or information is not properly standardized. Historians can get away with explanations that are grounded in naive/folk psychology rather than scientific psychology because it does not seem like they are making use of folk psychology when they explain, e.g., why a particular king did so and so. Or at least it is not sufficiently salient that their explanations are psychological. Likewise, social scientists can get away with making normative arguments which are flawed, and known to be flawed by philosophers, because it is not immediately obvious that they are giving precisely that argument.
A lack of standardization of terminology, research techniques, and what have you, makes it harder both for humans and intelligent machines to see connections between various fields, to spot flawed arguments, etc. Can such a standardization be carried out, then? This is a perennial question, on which some take the stance that all such standardization attempts are doomed to fail whereas others are more optimistic. Against the former view, I would argue that we have already reached quite a high degree of standardization in many aspects of research: style conventions, bibliography formats, what significance levels to use when applying certain statistical techniques, terminology within certain disciplines, etc.
Then again, the anti-standardizer has a point that requiring all researchers to use certain techniques and certain terminology in a certain way risks becoming a straitjacket. The answer, then, seems to be that science can be standardized to some extent but not through and through. This is, however, a quite boring and uninformative answer: it leaves open what we can standardize and what we cannot, and to what extent the former can be standardized.
Rather than speculating in the abstract, the best solution is perhaps to try to come up with concrete standardization ideas, and discuss whether they would work. One idea I had regarding terminological standardization, which I am not sure I believe in myself, is that researchers should be encouraged to write short versions of their articles in a “canonical form” that would use only a certain certified terminology. These short versions or abstracts would, in many cases, not say exactly the same thing as the main article, but they would give an indication of what other sorts of information (e.g. from other disciplines) would be relevant for the article. This would make things easier for researchers from other disciplines (since they would not have to know the terminology of the main article, but merely the translated terminology of the short version) and also, possibly more importantly, for computer programs. You could feed these short versions into programs which could give you all sorts of valuable data: which disciplines use which sorts of notions, which special expertise seems to be lacking in certain disciplines (e.g. they might use lots of psychological notions without actually knowing psychology), etc.
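To make the "feed the short versions into programs" idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the certified vocabulary, the mapping of terms to home disciplines, and the example abstracts are all invented, and simple substring matching stands in for whatever a real system would do.

```python
from collections import Counter, defaultdict

# Hypothetical certified vocabulary: each canonical term is mapped to the
# discipline that "owns" it. These entries are invented for illustration.
CERTIFIED_TERMS = {
    "loss aversion": "psychology",
    "confirmation bias": "psychology",
    "regression to the mean": "statistics",
    "opportunity cost": "economics",
}


def borrowed_notions(abstracts):
    """Count, per discipline, the certified terms it borrows from other fields.

    `abstracts` is an iterable of (discipline, canonical_abstract_text) pairs.
    Returns {discipline: Counter({home_discipline_of_term: occurrences})}.
    """
    usage = defaultdict(Counter)
    for discipline, text in abstracts:
        lowered = text.lower()
        for term, home in CERTIFIED_TERMS.items():
            # Only count terms whose home discipline differs from the author's.
            if term in lowered and home != discipline:
                usage[discipline][home] += 1
    return dict(usage)


# Invented canonical-form abstracts, tagged with the author's discipline.
example = [
    ("history", "The king's refusal to retreat is explained by loss aversion."),
    ("history", "Chroniclers show confirmation bias; later reigns show regression to the mean."),
    ("economics", "Opportunity cost and loss aversion jointly predict the choice."),
]

result = borrowed_notions(example)
for discipline, counts in result.items():
    print(discipline, dict(counts))
```

A tally like this would only show which fields lean on which other fields' notions. Whether the notions are used competently is a separate question that the counts cannot answer.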
Another issue is the standardization of research techniques. In medicine, for instance, there has been a lot of discussion on whether randomized controlled trials should be used more or less exclusively or whether other procedures are also valuable. Some people (such as Nancy Cartwright) argue that such standardization attempts are dangerous—that different techniques are appropriate for different problems—whereas others argue that not having very strict rules for how medical research is to be carried out opens the door to arbitrariness and bias.
When one reads scientific papers using statistical techniques, one is often struck by the number of relatively simple mistakes. These mistakes are not due to some mysterious lack of tacit “practical knowledge” but are rather quite easy to point out and explicate. It would seem to me that this should leave room for a more structured research process: one where researchers follow a more or less standardized procedure. I take it that this is already done to some extent.
Anyway, my aim with this post is not so much to give concrete suggestions as to raise a discussion. What do you think: can we standardize scientific methods and terminology? If so, how and in what respects? What consequences would that have? I would much appreciate any suggestions on this topic.
Or look at existing efforts. There are two main categories I know of: reporting checklists and quality-evaluation checklists (in addition to the guidelines/recommendations published by professional groups, like the APA’s manual, apparently based on JARS, or AERA’s standards).
Some reporting checklists:
STROBE (checklists; justification) for cohort, case-control, and cross-sectional studies
TREND (checklist; justification) for non-randomized experiments
CONSORT (checklist; justification) for RCTs
QUOROM/PRISMA (checklist; justification) for reviews/meta-analyses
Some quality-evaluation scales:
Jadad scale
NOS (scale; manual)
CEBM
I’m generally in favor of these. The recommendations are usually quite reasonable, and a tremendous amount of research neglects the basics or adds in pointless variation—certainly, people will plead that methodologist-nazis will cramp their style, but when I try to meta-analyze creatine and find that in the ~10 studies, they collectively employ something like 31 different endpoints (some simply made up by that particular set of researchers), most for no compelling reason, so that only 2 metrics were used in >3 studies, it’s hard for me to take this concern seriously—we clearly are nowhere near an equilibrium where researchers are being herded into using all the same inappropriate metrics.
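The kind of endpoint tally described above is easy to sketch in Python. The study names and endpoints below are invented placeholders, not the actual creatine studies; the point is only to show how one might count how many studies share each metric.

```python
from collections import Counter


def endpoint_usage(studies):
    """Map each endpoint to the number of studies that report it."""
    counts = Counter()
    for endpoints in studies.values():
        counts.update(set(endpoints))  # count an endpoint once per study
    return counts


def shared_endpoints(studies, more_than):
    """Return the endpoints reported in more than `more_than` studies."""
    counts = endpoint_usage(studies)
    return sorted(name for name, n in counts.items() if n > more_than)


# Invented placeholder data: study id -> endpoints it reports.
studies = {
    "study_a": ["digit_span", "reaction_time"],
    "study_b": ["digit_span", "ravens_matrices"],
    "study_c": ["digit_span", "custom_scale_1"],
    "study_d": ["digit_span", "reaction_time", "custom_scale_2"],
    "study_e": ["ravens_matrices"],
}

usage = endpoint_usage(studies)
print(len(usage), "distinct endpoints across", len(studies), "studies")
print("usable for pooling:", shared_endpoints(studies, more_than=3))
```

If most endpoints show up in only one or two studies, as in this toy data, there is very little left to pool in a meta-analysis.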
Thanks, very good post.
I agree. This reminds me of this post:
Standardization of techniques and terminology could be seen as a kind of “censorship” (certainly it is seen as such by the people who talk of methodologist-nazis). Just as academics fail to see how many advantages academia’s system for excluding the ignorant has, even though that system is itself a kind of censorship, it is easy to fail to see the many advantages that the standardization of terminology and techniques we already have brings with it. The present academic system, and certain terminology and techniques, have become so natural to us that we fail to see that they are the result of a process which in effect functioned as a kind of censorship (“don’t do so and so if you want to be taken seriously”). Hence we fail to see how many advantages such censorship has over anarchism.
Excellent material as usual, thank you.
Obligatory xkcd
IMO what we need is online revision-controlled semantic databases of scientific knowledge which can be edited by all researchers (at least), are freely accessible by everyone and have convenient APIs for data mining.
Interesting, but how would that work out in practice? Should anyone be able to revise, e.g. books? Who should decide in the case of conflict? Should we use majority voting? Weighted majority voting? Should the original author have a special say?
I am not talking about books. Think about something like Wikipedia, only semantic (that is, having a formal structure allowing for automated queries, e.g. a relational database). I imagine the revision process should also be along the lines of Wikipedia, i.e. users work out problems among themselves, and if they fail, community-appointed administrators resolve it.
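As a rough sketch of what "semantic and revision-controlled" could mean in relational terms, here is a toy example using Python's built-in `sqlite3` module. The table name, columns, editor names and rows are all hypothetical, chosen only to illustrate the idea; a real system would need far more (provenance, conflict resolution, access control).

```python
import sqlite3

# In-memory database with a hypothetical schema: every edit adds a new
# revision of a statement instead of overwriting the old one.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE statements (
        entity   TEXT NOT NULL,
        property TEXT NOT NULL,
        value    TEXT NOT NULL,
        revision INTEGER NOT NULL,
        editor   TEXT NOT NULL,
        PRIMARY KEY (entity, property, revision)
    )
    """
)

# Illustrative rows only; the second quartz row models a later correction.
conn.executemany(
    "INSERT INTO statements VALUES (?, ?, ?, ?, ?)",
    [
        ("quartz", "formula", "SiO2", 1, "alice"),
        ("quartz", "crystal_system", "hexagonal", 1, "alice"),
        ("quartz", "crystal_system", "trigonal", 2, "bob"),
        ("halite", "formula", "NaCl", 1, "carol"),
    ],
)


def current_value(conn, entity, prop):
    """Return the value from the latest revision, or None if nothing is stored."""
    row = conn.execute(
        """
        SELECT value FROM statements
        WHERE entity = ? AND property = ?
        ORDER BY revision DESC
        LIMIT 1
        """,
        (entity, prop),
    ).fetchone()
    return row[0] if row else None


print(current_value(conn, "quartz", "crystal_system"))
print(current_value(conn, "halite", "formula"))
```

Because old revisions are kept, the full edit history stays queryable, which is roughly what Wikipedia's page history gives you for prose.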
Like Wikidata and Wikispecies?
Yep, I’d say these are good examples.
DBpedia does provide a way to query Wikipedia with automatic queries.
For a lot of knowledge, however, it’s not trivial to structure it in a useful way. Turning an abstract of a scientific paper into computer-readable knowledge is not trivial and requires a lot of thinking about ontology and the structuring of knowledge.
It’s the same problem you have when writing Anki cards. It’s very hard to formalize real knowledge.
Agreed. However, there is a lot of knowledge which is easy to structure. Examples: minerals, crystals, chemical substances, chemical reactions, biological species, genes, proteins, metabolic pathways, biological cell types, astronomical objects, geographical / geological objects, archaeological findings, demographical data (including historical)...
In practice we need a mixture of structured and unstructured (like regular Wikipedia) information.
There’s no real reason why proteins and archaeological findings should be in the same database.
UniProt, with both Swiss-Prot and TrEMBL, works well.
PubChem is in principle a good way to store information about chemical substances. I don’t know much about crystals, but do we need a database about them that is separate from PubChem?
From what I heard, the data quality of PubChem isn’t ideal. But that’s not a problem easily solved by creating a new database.
http://eol.org/
I don’t know much about astrophysics, but I would be surprised if those folks have enough money to buy all those telescopes but not enough money to have a good database of astronomical objects.
OpenStreetMap is open data for geography. Do you think it lacks something?
I think there are databases for those things; http://en.wikipedia.org/wiki/List_of_biological_databases#Metabolic_pathway_and_Protein_Function_databases lists a bunch.
http://www.obofoundry.org/ also provides nice information. In bioinformatics there are plenty of people who care about organising databases of knowledge that’s easy to structure.
I have no idea on that front. It could be that the related academics don’t use computers enough to have a decent database.
I don’t know of the correct source, but there’s probably a lot of complicated copyright involved. Different definitions of terms also complicate things. Illegal drug sales were recently added to the [British GDP](http://marginalrevolution.com/marginalrevolution/2014/02/improving-gdp.html). Having a database that makes it easy to compare numbers won’t be easy.
Great links, thanks! The situation looks much better than I assumed.
Probably not separate. However, it seems that PubChem doesn’t store data about crystal structure, unless I’m missing something (I looked at the entries for SiO2 and NaCl)? Also, PubChem doesn’t seem to have lots of data about reactions.
For geography it’s probably good, but it doesn’t seem to have much data about geology, unless I’m missing something? The latter would require some sort of 3D map of the Earth’s crust.
In general in the last decade a lot of people in the bioinformatics community tried to find solutions to problems in that sphere.
People like Barry Smith did a lot of work on ontology, and we now have a bioinformatics-driven ontology for emotions because the psychologists just don’t work on that level. When it comes to what the psychologists themselves produce they are stuck with utter crap like the DSM-5. The DSM gets produced by the American Psychiatric Association.
PubChem is probably reasonably good where it touches areas that bioinformatics is interested in, but crystals aren’t in that sphere.
A lot of information about chemicals that’s out there is also the intellectual property of big pharma companies who aren’t happy with sharing it in an open fashion. The American Chemical Society fought against PubChem being well funded.
It’s an interesting pattern. Bioinformatics might work precisely because it has no huge society of bioinformaticians that can hold back scientific progress in the way the associations of the chemists and psychologists do.
I don’t know exactly, but I think if the data is available it should go somewhere in that project.
I feel like I would have gotten more out of this post if there were examples, or at least links to examples of some of the claims.
What is an example of a claim made by historians rooted in folk psychology?
What is an example of a social scientist’s argument which is flawed and known to be flawed by philosophers?
What are some examples of scientific papers making relatively simple mistakes?
ETA: Even better would be some analysis on how widespread these problems are in their respective fields.
There’s a statistical mistake appearing in half of academic neuroscience papers: http://www.theguardian.com/commentisfree/2011/sep/09/bad-science-research-error
Yes, I admit it’s a bit thin. My aim was not so much to defend a thesis (I don’t know enough about the topic to do that) but to raise the issue. I think it was quite successful given the level of discussion; e.g. I learnt a lot from gwern’s excellent post above. That said, I could have been a bit clearer.
Anyway, I think one should be able to post this kind of post in the discussion section (whereas in the main section you need to be more precise and detailed).
Agreed. I still would have liked more examples! :D
I don’t see why this is good. For example, crossing the 5% “statistical significance” threshold is a more or less standard requirement to get published. That’s not something desirable.
And, of course, there is the overall objection that science is about finding out new stuff and that new stuff does not always fit well (or at all) into the usual/established/standard ways of doing things.
Part of doing science is the development of new methods and terminology.
No, the best way is to look at standardization efforts that actually exist in the real world.
But standardization can also be dangerous as it generates power. The ACS (American Chemical Society), for example, introduced CAS numbers for researchers to identify chemical compounds and encouraged researchers to use those numbers in articles to identify which substances they are speaking about.
The problem is that they want you to pay money every time you want to look up a compound via its CAS number. They sued Wikipedia for putting CAS numbers in their articles. If you want to mine the literature you have to do hundreds of thousands of lookups and can’t just pay what the ACS wants at per-compound lookup prices.
Given how crappy the new DSM-5 has become, the NIH made a move toward less standardization and encouraged researchers to develop other ways of classifying mental disease.
If a biologist finds a method that gives him the results he wants 99% of the time, he is really happy. If a mathematician has a method that gave everyone in history the same result, that’s an assumption that might or might not be true and is in need of a mathematical proof. Trying to get the biologist and the mathematician to agree on the same terminology is going to be very hard because they have very different standards.
The need for computer-understandable text has driven bioinformaticians to publish huge ontologies of vocabulary.
I think that’s primarily a problem of peer review, where only the colleagues of a researcher are asked, but it would be good if someone with a statistics background looked over every paper.
Sure, but standardization has, by and large, been for the better, I would argue. If we hadn’t converged on e.g. a specific terminology for measuring various quantities, communication would have been much harder (and to the extent that we have not converged, for instance because Britons and Americans use alternatives to the metric system, this gives rise to lots of unnecessary complications). But you’re right that it could lead to problems. The ACS’s practice seems quite malign.
The point about it being hard to get different disciplines to agree on one terminology because of different standards and norms is a good one. I agree with it, and I don’t think consensus is possible. The question is whether it is necessary. I’m thinking one could try to set up this system and let researchers join if they want (e.g. if they want to advertise their results through the system). Postmodernists and others who would not want to use the standardized terminology would not join, but if the system became sufficiently influential, this would presumably be to their disadvantage.
My idea is very much in line with the Unity of Science movement, an idea that I am unfashionably positive about. Science should not be compartmentalized: even though there should of course be a thorough division of labour, science should be a heavily integrated enterprise.
‘Influence’ is a strange thing, as it depends not only on the factors we want it to depend on. ‘Ideally’ we’d want scientific results to be the ultimate criterion, but the effort to publish in high-impact journals shows that it is not so. It could take a very long time until such a system becomes established in the scientific ecosystem.
I’m learning more and more that my thoughts are not original.
It could be worse but a non-profit scientific association holding back the advancement of science through locking up knowledge is a serious issue.
The DSM-5 is more problematic. People who get hit strongly on the head often develop depression. Should psychologists really diagnose them with the same “depression” as the person who’s depressed because his parent died?
I don’t think so. That means we need to develop better vocabulary and better tests to distinguish those two people who get labeled the same via DSM-5 criteria.
It would also be nice to have criteria for depression that are optimized in a way that two psychologists will agree on whether a particular person is depressed.
Let’s say the fMRI people finally do something useful and find a way to diagnose depression via fMRI and give us an objective numerical score for how depressed someone is. Depression_fMRI won’t be exactly the same thing as depression_DSM5.
Maybe we need 5 different depression definitions and test interventions for every depression definition separately to find the best treatment for particular patients.
Looking up the definition of what depression means prevents that process from going forward.
At least some of the frontier of science will always be unstandardized.
Standards FOLLOW knowledge and are difficult to establish. Standards developed too early are often abandoned or at least deprecated as standards developed after more knowledge prove to be better. In an important sense, English units in science and measurement, gallons, feet, yards, knots, miles, BTU, psi, etc are inferior in many ways to metric a.k.a. SI standards, which came about later. Interestingly SI has not fully driven out English units, but it has certainly deprecated them. Meanwhile, I deal all of the time with things like grams/mile of CO2 emissions from cars, where the astute reader will recognize that this should be lbs/mile or grams/kilometer, but in the US we drive in miles and there is no avoiding that, while our scientists are rather more able to deal with grams of CO2.
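For what it is worth, the mixed-unit figure is a one-line conversion away from either consistent form. A small Python sketch follows; the 400 g/mile figure is made up for illustration, while the two conversion constants are the exact definitions of the international mile and the avoirdupois pound.

```python
KM_PER_MILE = 1.609344       # exact, by definition of the international mile
GRAMS_PER_POUND = 453.59237  # exact, by definition of the avoirdupois pound


def g_per_mile_to_g_per_km(g_per_mile):
    """Convert an emission rate from grams/mile to the all-metric grams/km."""
    return g_per_mile / KM_PER_MILE


def g_per_mile_to_lb_per_mile(g_per_mile):
    """Convert an emission rate from grams/mile to the all-English lb/mile."""
    return g_per_mile / GRAMS_PER_POUND


rating = 400.0  # made-up CO2 figure in grams/mile
print(f"{rating} g/mile = {g_per_mile_to_g_per_km(rating):.1f} g/km")
print(f"{rating} g/mile = {g_per_mile_to_lb_per_mile(rating):.3f} lb/mile")
```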
So as long as science is advancing, there will be areas where standards are being threshed out, and probably some areas where they are being threshed out too soon, and so much of the valuable work will be unable to use these nascent standards.
It is not a new observation that standards are useful and so should be developed, but rather an old and ongoing one.
A trivial example: Medical tests are usually reported without units, but there are usually several common units, in particular mass per volume and moles per volume.
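To illustrate why the missing unit matters, here is a small Python sketch converting between a mass-per-volume unit (mg/dL) and a moles-per-volume unit (mmol/L). Glucose is used purely as an example analyte with an approximate molar mass of 180.16 g/mol; the sample reading is made up.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g/mol, approximate; glucose is just an example


def mg_per_dl_to_mmol_per_l(mg_per_dl, molar_mass_g_per_mol):
    """Convert mass concentration (mg/dL) to molar concentration (mmol/L)."""
    mg_per_l = mg_per_dl * 10.0             # 1 L = 10 dL
    return mg_per_l / molar_mass_g_per_mol  # mg divided by g/mol gives mmol


def mmol_per_l_to_mg_per_dl(mmol_per_l, molar_mass_g_per_mol):
    """Inverse conversion: molar concentration back to mass concentration."""
    return mmol_per_l * molar_mass_g_per_mol / 10.0


reading = 90.0  # made-up glucose value in mg/dL
as_molar = mg_per_dl_to_mmol_per_l(reading, GLUCOSE_MOLAR_MASS)
print(f"{reading} mg/dL is about {as_molar:.2f} mmol/L")
```

The two numbers differ by more than an order of magnitude for the same sample, so a bare number on a lab report is ambiguous unless the unit is stated.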
Quite often you have the possibility to go for a standard or to avoid doing so. If everyone would just go with the official standard that weeks begin on Monday, distance is measured in meters and mass is measured in kg, that would be progress. Getting all those people who deviate from the standard to change their behavior is difficult.
If you really like standardization read a few ISO standards that touch domains in which you publish information.
Funny you should mention that. Laboratory courses in physics at my uni are known for enforcing arbitrary-seeming rules where the TAs are sometimes unable to agree on what exactly those rules are. This is despite the fact that there are standards published by the national standards organisation concerning how to write down experimental data and how to properly deal with measurement errors. Which I can’t access for free through my uni.
Standards often are arbitrary and not only seem to be that way.
But at least they are formalised and written down. In that damned course they are not.
Of course we can standardise science without much creative loss. We just have to be aware of the potential loss and give space to non-standardised methods. This is similar to how obviously we can standardise education or at least testing without much creative loss. In both cases people, usually teachers, bemoan some kind of loss of freedom or creativity. I posit that in both cases similar incentives stop standardisation being implemented.
More generally, we need a better picture of our social institutions to analyse this kind of situation. What do we want to encourage in science and education? If we can’t answer this question clearly we can’t give a clear answer to how we should design the specific institutions, among them standardisation.
How do you know this? How good are people at deliberately leaving room for non-standardized methods?
Good points. I think it’s easier to standardize education though, since you encounter more similar problems there all the time. Teaching nine-year-olds arithmetic is a pretty similar problem regardless of what class you have (at least given a certain cultural setting) whereas scientific problems are more variable.
However, I also think that in both cases, people bemoan the loss of freedom/creativity too much. Sure, following standard practices is often more boring and less glamourous, but it’s also often more efficient. When industries using standardized techniques replaced artisans, people bemoaned that too, even though it enhanced productivity vastly.
I also agree we need a good picture of our social institutions (i.e. science in this case) to answer the question how we should approach standardization.
Interestingly, first I wanted to argue that standardisation benefited industry in their workings. It was then that I realised we don’t actually know what science is exactly working for, so we can’t define a ‘scientific supply chain’ and decide where we’d want to formalise procedures. So I dropped that line of argumentation and submitted the above post.
A modest proposal: Anyone wishing to create a file format will be forced to write out a 10 megabyte example of the format, by hand, without error, before the new format will be accepted.