Experiments are in the territory, results are in the map

I recently read Thomas Kuhn’s book The Structure of Scientific Revolutions. Scott Alexander wrote up a review years ago, which I mention so that I don’t have to summarize the book. The claim in Kuhn’s book which I want to focus on is that the same experiment might have different results in different scientific paradigms.

In the latter half of his section Revolutions as Changes of World View, Kuhn insists that this is not merely a matter of scientists under different paradigms seeing the same thing and interpreting it differently. He freely admits that he hasn’t developed a complete replacement for that account, but he offers some interesting historical examples of scientists looking at basically the same thing after a paradigm shift and seeing something different. One is that in the decades immediately after the Copernican paradigm was accepted, European astronomers found dozens of new “planets” which, he argues, had previously been recorded as immutable stars. There was no accompanying technological improvement of note, and he points out that Chinese astronomers, with no ideological attachment to immutable heavens, had recorded such objects earlier.

Another fun example is the revolution in chemistry after Dalton got everyone believing that atoms combined in whole-number ratios to make molecules. Suddenly, chemists started writing down the weights of the components of compounds in ways that made whole-number ratios obvious. Measured with the same equipment, the ratios themselves also came out closer to whole numbers in the generation of experiments that followed the paradigm’s acceptance. I don’t think he gives a mechanism for why this happens, but his remark at the very end of the section that chemists had to “beat nature into line” rules out neither confirmation bias nor a higher degree of care in measuring things which became relevant to the paradigm. He is clear that reality has not changed, but he thinks that the perception of the scientists is different, and he makes analogies to optical illusions and such which frankly leave me confused about what he’s actually trying to say.

I suspect that he’s twisted himself up into a philosophical pretzel over it, but Kuhn’s claim, and his examples of new generations of scientists finding new implications in old experiments, point at something interesting that I want to explore a bit. The claim that the results of the same experiments are different under different scientific paradigms is either incredibly obvious or incredibly subtle depending on how you interpret it. The obvious interpretation is that if you don’t agree on what the world is made out of or what you’re looking for in it, you won’t say that experimental results mean the same thing. The more subtle interpretation is that your ability to observe experiments is so far above the level of the machine code of the universe that any experimental result you come up with will be composed of abstractions that you can’t treat as objective.

Let us examine the obvious interpretation. Kuhn talks about how Aristotelian physics said that objects wish to go to the lowest point they can, though resistance may keep them at a higher one. From that point of view, a pendulum which swings around a lot before settling at its lowest point is merely falling with style. Galileo, however, had adopted the medieval theory of impetus, and he saw a pendulum as a thing that swings back and forth at the same frequency forever. Kuhn notes that Galileo’s experiments reflected this belief to an even greater degree than later, more careful ones did, and that the independence of period and amplitude is “nowhere precisely exemplified by nature” itself. At any rate, these paradigms are incompatible, and yet both could claim victory from the pendulum experiment, because they cared about different things. Per Kuhn, “the Aristotelian would measure (or at least discuss - the Aristotelian seldom measured) the weight of the stone, the vertical height to which it had been raised, and the time required for it to achieve rest… Neoplatonism directed Galileo’s attention to the motion’s circular form. He therefore measured only weight, radius, angular displacement, and time per swing... interpretation proved almost unnecessary.”
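
As an aside, Kuhn’s point that the idealized pendulum law is “nowhere precisely exemplified by nature” is easy to check numerically. Here is a quick sketch (my own toy parameters, nothing from Kuhn) that integrates a real pendulum and shows the period drifting away from the amplitude-independent ideal as the swing gets larger:

```python
import math

G, LENGTH = 9.81, 1.0  # m/s^2 and m; ordinary made-up values

def pendulum_period(theta0, dt=1e-4):
    """Integrate theta'' = -(G/LENGTH) * sin(theta) starting from rest at theta0.

    By symmetry, the full period is four times the time the bob takes, released
    from rest, to first swing through the vertical (theta = 0).
    """
    theta, omega, t = theta0, 0.0, 0.0
    while theta > 0.0:
        omega -= (G / LENGTH) * math.sin(theta) * dt  # semi-implicit Euler step
        theta += omega * dt
        t += dt
    return 4.0 * t

ideal_period = 2.0 * math.pi * math.sqrt(LENGTH / G)  # the amplitude-free idealization
for amplitude_deg in (5, 30, 60, 90):
    period = pendulum_period(math.radians(amplitude_deg))
    print(f"amplitude {amplitude_deg:3d} deg: period {period:.4f} s "
          f"(ideal law says {ideal_period:.4f} s)")
```

At small amplitudes the two agree to a fraction of a percent, which is part of why Galileo could see his law in the data; at large amplitudes the discrepancy is obvious if you go looking for it.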

The subtler interpretation says that you can’t measure anything without having a model for what the thing is. I think about this like recognizing constellations. Doing an experiment is like looking up at a starry sky, finding a constellation, and saying what time it is. If someone asks you how you know what time it is, you say something like “Virgo is over Mt. June” or whatever, and this is a result. This statement of the result is meaningless outside of your culture, though; Virgo is just a line you drew around some dots you saw. When you report the mass of an object using a balance, you are similarly drawing a line around a bit of reality and calling it “mass.” It is meaningful to you. You can do some experiments where you put a rocket on objects of different masses, measure displacement over time, and show that the acceleration follows what Newton’s laws predict it should be for that mass. That’s the thing, though: you can’t measure “mass” directly. You need a theory that says gravity pulls on equal masses with equal force, so that the balance is in balance when equal masses sit on either side of it. Mass is only real to the extent that it has predictive power in your paradigm and that prediction reflects reality. You might try to say “look at reality, the giant thing has mass because it crushes me if I try to hold it up and it’s hard to move,” but that’s confusing the map for the territory, no matter how well you’ve aligned the two. You end up with a circular definition of mass as the thing that gravity pulls on harder, or the thing that has more inertia.
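
To make the circularity concrete, here is a toy version of the rocket experiment above, with numbers I made up. The only thing the “experimenter” ever records is displacement over time; “mass” only shows up once you commit to the model F = ma and solve for the parameter it demands:

```python
import numpy as np

rng = np.random.default_rng(0)

true_mass = 3.0   # kg, hidden from the "experimenter"
thrust = 12.0     # N, the known force from the rocket
times = np.linspace(0.0, 5.0, 50)

# What we actually observe: positions over time, with some measurement noise.
positions = 0.5 * (thrust / true_mass) * times**2 + rng.normal(0.0, 0.05, times.size)

# The model step: assume x(t) = 0.5 * a * t^2, fit for a, then *define* mass as F / a.
slope = np.polyfit(times**2, positions, 1)[0]
acceleration = 2.0 * slope
inferred_mass = thrust / acceleration

print(f"inferred mass: {inferred_mass:.3f} kg (true value: {true_mass} kg)")
```

Nothing in the recorded data is labeled “mass”; the number only exists relative to the model you fit.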

This is particularly compelling for me as a person who works with particle physics. When you do a particle physics experiment, you’re generally measuring the momentum of some stable particle in order to find the mass or excited state or size or whatever of some unstable or otherwise difficult-to-measure-directly particle. We can only measure properties of these objects based on what our theories say they will do to other objects, so it is natural to think of the results of an experiment as model-dependent. We can try to get around this by saying things like “the result of this experiment was that I took this thing, smashed it into this thing, and saw these energy depositions in these detectors at these times” (which sounds awesome, and maybe we should do more of that), but at some point you need to update your model and claim that the update came from experimental evidence. The Standard Model also has some funny interpretations of experiments, like masses that change when you probe them with higher-energy particles, and bare masses that depend on your renormalization scheme (I feel like I should link something if I say something like that, so here’s a link to a section of a Wikipedia article that sort of mentions both of these things). I am at least open to the idea that the very things we’re measuring are built on top of such a tower of abstraction that we shouldn’t say our measurements have a one-to-one correspondence to reality. The mass of a proton is a thing which has a direct analog in the Newtonian mass of an ordinary object, even though our best theory of matter says that most of that mass is the binding energy of the quarks due to the gluon field, which does not seem like what Newton thought “matter” was. Nevertheless, we can build up chemistry by figuring out how many protons and neutrons there are in an atom, multiplying that by our so-called “mass of the proton,” and using that number to make good predictions for how much a lump of carbon or whatever will accelerate if you put a force on it.
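
For a concrete (and entirely invented) illustration of how model-laden these measurements are: a detector hands you the momenta of a couple of stable daughter particles, and the “mass” of the parent only exists once you assume special relativity and combine them into an invariant mass. The track momenta below are made-up numbers, not real data:

```python
import numpy as np

MUON_MASS = 0.1057  # GeV, itself a previously inferred quantity

def four_momentum(px, py, pz, mass):
    """Build (E, px, py, pz) from a measured three-momentum and an assumed mass."""
    energy = np.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return np.array([energy, px, py, pz])

def invariant_mass(p1, p2):
    """m^2 = E^2 - |p|^2 for the combined system (natural units, GeV)."""
    total = p1 + p2
    return np.sqrt(total[0]**2 - np.sum(total[1:]**2))

# Two hypothetical muon tracks, momentum components in GeV.
mu_plus = four_momentum(30.0, 10.0, 5.0, MUON_MASS)
mu_minus = four_momentum(-25.0, -5.0, 40.0, MUON_MASS)

print(f"reconstructed parent mass: {invariant_mass(mu_plus, mu_minus):.1f} GeV")
```

The “measurement” of the parent’s mass is really a statement about what our kinematic model says those two tracks imply.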

I might be able to clarify what I’m saying with a Bayesian interpretation of this. As a model for an experiment, consider a simple “$H$ is true given an observation of $O$” update, which could look like

$$\frac{P(H \mid O)}{P(\lnot H \mid O)} = \frac{P(H)}{P(\lnot H)} \cdot \frac{P(O \mid H)}{P(O \mid \lnot H)},$$

or “[odds of $H$ given $O$] is [prior odds of $H$] times the ratio of [$O$ given $H$] to [$O$ given not $H$]”. The result of an experiment is a probability distribution across some beliefs given your observations. Above, the result is the set of probabilities $P(H \mid O)$ and $P(\lnot H \mid O)$. There are two obvious ways you can end up with a different probability distribution. You could start with different priors. Above, the priors are $P(H)$ and $P(\lnot H)$, which could be different for different observers. The other way is to disagree on the probability of various observations. Referring to the equation above, your model might assign one value to $P(O \mid H)$ while another model assigns a different one, which would change your posterior odds even if both models had the same prior. A more subtle way to end up with a different result is by having different observations in the same experiment via, for example, selective attention toward different consequences of the experiment (including grouping together observations which could be meaningfully distinguished by a more sophisticated theory or better measurement). This is technically disagreeing on the probability of various observations given the hypothesis, but it’s not that you’re disagreeing on the probabilities so much as disagreeing about what the observations are that you attach probabilities to. I think that, in terms of the equation above, two people are looking at the same experiment, but one of them is making the observation $O$ while the other is paying attention to some different observation $O'$ that may be entirely or only subtly different from $O$, even though the piece of reality that they’re interacting with was the same.
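
Here is a small sketch, with numbers I made up for illustration, of the three ways described above that the same experiment can yield different results: different priors, different likelihoods, and different choices of what even counts as the observation:

```python
def posterior(prior_h, p_obs_given_h, p_obs_given_not_h):
    """Bayes' rule in odds form, returned as P(H | O)."""
    prior_odds = prior_h / (1.0 - prior_h)
    posterior_odds = prior_odds * (p_obs_given_h / p_obs_given_not_h)
    return posterior_odds / (1.0 + posterior_odds)

# 1. Same likelihoods, different priors.
print(posterior(0.5, 0.8, 0.3), posterior(0.1, 0.8, 0.3))

# 2. Same prior, different models of how likely the observation is under H.
print(posterior(0.5, 0.8, 0.3), posterior(0.5, 0.4, 0.3))

# 3. Same prior and same underlying reality, but one observer lumps the result
#    into a coarser observation ("the needle moved" rather than "the needle
#    moved to 3.2"), which changes both likelihoods and therefore the update.
print(posterior(0.5, 0.8, 0.3), posterior(0.5, 0.9, 0.7))
```

In each pair, the same piece of reality produces different posteriors, which is all I mean by the result depending on the paradigm.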

The important thing here is that an experimental result is not a static thing. An experiment doesn’t prove anything; it provides evidence, and that evidence can update multiple hypotheses. When we introduce a new theory, old experiments need to be seen to back it up, even though that is not what they were originally performed for. The new theory needs to add up to normality. We would like to do better than the historical track record of scientists: we don’t want to have to wait for an old generation of scientists to retire so that a new one with a better framework can take over; we want to be able to switch from one model to another if the new model is more aligned with the territory. Even if you’re not trying to make new paradigms, if you find yourself working in a new paradigm it is worth reconsidering the results of old experiments. Historical evidence suggests it may even be worthwhile to rerun those experiments. You might see something relevant to you that the old experimenters did not.