Explanations as Hard to Vary Assertions
Update: After some investigation, I found out that The Beginning of Infinity by David Deutsch contains a few minor misquotes of Popper, Turing and others. Nevertheless, it is an excellent book.
Background
As I read through Rationality: A-Z, I kept seeing similarities to David Deutsch’s worldview. Deutsch pioneered quantum computation in the 1980s, motivated by the prospect of a deeper grasp of quantum physics and by a potential way to test many-worlds.
This post is adapted from my review of The Beginning of Infinity. I read it a couple of years ago, and it is among the most formative books I have read.
Overview
We have a great deal of knowledge about the vast and unfamiliar reality that causes our observations and the elegant, universal laws governing that reality. This knowledge consists of explanations: assertions about what is out there beyond appearances and how it works. Where do explanations come from? The source of our knowledge is a process of conjectures alternating with criticism. Humans possess the capacities for creativity and rationality, enabling them to actively pursue error correction through creating, combining, altering and criticising ideas in the quest for good explanations.
Good Explanations
The role of experiment and observation is to choose among the ideas we come up with. We interpret experiences through explanatory theories, but good explanations are not obvious. Fallibilism entails not looking to authorities but acknowledging that we may always be mistaken and that no belief can ever be rationally supported or justified conclusively. Always, there remains a possible doubt as to the truth of the belief. What distinguishes science from other belief systems is that scientific beliefs are always defeasible and never final. We ought to continuously be correcting errors and updating beliefs in the quest for knowledge. We correct errors by seeking good explanations.
Good explanations in science are hard-to-vary assertions about reality. They are hard-to-vary because they provide specific details that fit together so tightly that changing them ruins the explanation. This criterion helps to eliminate bad explanations that keep adding justifications in light of refutations and counterevidence to avoid falsification. An explanation that is hard-to-vary but does not survive a critical test can be considered falsified.
We can explain what it means for a conjecture to be hard-to-vary in terms of Bayes’ theorem.
[Paraphrasing from Decoherence is Falsifiable and Testable] A good explanation offers precise assertions about reality. If there is some evidence E that the assertion H can’t explain, then the likelihood P(E|H) will be tiny. Thus, the numerator P(E|H)P(H) will also be tiny, and likewise the posterior probability P(H|E). Updating on the near impossibility of the evidence has driven the probability of the assertion down to epsilon. A theory that refuses to make itself vulnerable in this way will need to spread its probability widely by being vague. [Update: this is slightly incorrect because it doesn’t consider all possible hypotheses; please refer to Eliezer’s elucidation.]
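A small numerical sketch may make the trade-off between precision and vagueness concrete. The two hypotheses, their equal priors and the ten possible outcomes are invented for illustration and do not come from Deutsch or from Decoherence is Falsifiable and Testable. The posterior is normalised over both hypotheses rather than read off the numerator alone.

```python
from fractions import Fraction

# Ten mutually exclusive outcomes an experiment could produce (hypothetical).
OUTCOMES = range(10)

def sharp(outcome):
    """Hard-to-vary hypothesis: stakes 91% of its probability on outcome 3."""
    return Fraction(91, 100) if outcome == 3 else Fraction(1, 100)

def vague(outcome):
    """Easy-to-vary hypothesis: spreads its probability evenly over all outcomes."""
    return Fraction(1, 10)

def posterior(priors, likelihoods, observed):
    """Bayes' theorem, normalised over every hypothesis under consideration."""
    joint = {name: priors[name] * likelihoods[name](observed) for name in priors}
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

# Assumed equal priors, chosen only for illustration.
priors = {"sharp": Fraction(1, 2), "vague": Fraction(1, 2)}
likelihoods = {"sharp": sharp, "vague": vague}

confirmed = posterior(priors, likelihoods, observed=3)  # the risky prediction comes true
refuted = posterior(priors, likelihoods, observed=7)    # the risky prediction fails

print(float(confirmed["sharp"]))
print(float(refuted["sharp"]))
```

Under these assumed numbers, the sharp hypothesis rises from 1/2 to 91/101 (about 0.90) when its risky prediction comes true and falls to 1/11 (about 0.09) when it fails. It earns the gain only by exposing itself to that loss.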
Frank Wilczek describes hard-to-vary-ness as follows: “A theory begins to be perfect if any change makes it worse.” He explains further using the Standard Model as an example of a hard-to-vary explanation:
Too many gluons! But each of the eight colour gluons is there for a purpose. Together, they fulfil complete symmetry among the color charges. Take one gluon away, or change its properties, and the structure would fall. Specifically, if you make such a change, then the theory formerly known as QCD begins to predict gibberish; some particles are produced with negative probabilities, and others with probability greater than 1. Such a perfectly rigid theory, one that doesn’t allow consistent modification, is extremely vulnerable. If any of its predictions are wrong, there’s nowhere to hide. No fudge factors or tweaks are available.
Good explanations help us achieve better map-territory convergence. They allow us to construct more accurate models of the territory. To grasp reality, we must resist the temptation to start from conclusions and bend the facts to fit them. Grasping reality entails overcoming our cognitive biases, going on joyful explorations across the territory and improving our map along the journey.
Poor explanations purport to explain anything and everything. Such explanations explain nothing. Freudian psychoanalysis was equally good at coming up with an explanation for every possible thing the patient could do. Similarly, God and magic can explain anything and everything. Therefore, they offer us no explanatory power.
Some good explanations have enormous reach: they explain more than they were initially intended to explain. In science, good explanations gave rise to the principle of Testability, which constrains a scientific explanation to be hard-to-vary. Still, good explanations go beyond science and apply to philosophy, politics, morality, economics, etc.
Bad Explanations
Explanations are two a penny. Good explanations are extremely hard to come by. Bad explanations are not necessarily false. They can be true but completely lacking in explanatory power.
Suppose you are watching a conjuring trick, and you are trying to explain what is happening. An example of a bad explanation would be, “Well, it is magic.” That is a bad explanation because you can apply that explanation to anything. Another example of a bad explanation is to say, “Well, the conjurer did something.” This shows that a bad explanation doesn’t necessarily have to be false but just utterly inadequate.
By analogy, if we take the laws of physics and try to explain things in the natural world, we could answer the questions “What is the origin of species?” and “What is the origin of adaptations in the biological world?” with “Atomic interactions cause them.” This statement is true, but it doesn’t explain anything. A good explanation of these phenomena is the modern variant of the Theory of Evolution.
Science and philosophy are both subsets of the quest for good explanations. Science and philosophy overlap, but Popper’s criterion of demarcation helps us avoid going down blind alleys. It states that scientific theories are in principle testable by experiment and metaphysical theories are the ones that aren’t while making no judgment about the validity of either type of theory.
Dogma
Bad philosophy—a subset of bad explanations—does not only contain falsehoods, but it also disturbs our ability to search for good explanations. False philosophy is not harmful; in fact, errors are the standard state of human knowledge. We can expect to find errors everywhere, including in the theories that we most cherish as true. However, bad philosophy is harmful because it aims to cut off the progress of knowledge, coercing us to remain in the dark. It is the kind of philosophy that not only makes false claims but more dangerously says, “You mustn’t think about so and so.”
Before the Enlightenment, the Church was the authority forcefully closing off the progress of knowledge to maintain its hegemony. Today, the scientific establishment has become the new Church, oppressing creativity and imagination. Science today insists that everyone believes in the same thing and in the same way.
Empiricism can be and has been misused and misapplied throughout the history of science. Galileo’s fellow scientists pointed to empirical evidence to resist his theory. For example, a ball dropped from the top of a tower lands at the foot of the tower, which suggested to them that the Earth was stationary. However, Galilean relativity explains this observation through inertial frames of reference, just as a ball dropped from the mast of a steadily sailing ship lands at the foot of the mast. This letter from Galileo to Kepler captures his frustration:
My dear Kepler, I wish we could laugh at the extraordinary stupidity of the mob. What say you about the foremost philosophers of this University, who with the obstinacy of a stuffed snake, and despite my attempts and invitations a thousand times they have refused to look at the planets, or the moon, or my telescope?
The observation of stars with the naked eye provides another example of a parochial error in science. Generations of philosophers and scientists speculated about the reality of stars in the night sky, convinced that twinkling was a real property of stars. Modern telescopes contain automatic mechanisms that continuously change the shape of the mirror to compensate for the shimmering of the Earth’s atmosphere. Observing through such a telescope, stars do not appear to twinkle as they did to generations of observers in the past. Those observations of stars twinkling are only appearances. These appearances are certainly real aspects of our perception, but they have nothing to do with the reality of stars. Thus, we cannot be certain about our observations.
Modern cognitive science tells us that our brains reconstruct visual reality. There is no such thing as a raw experience of reality. The famous lines illusion is an illustration of visual bias.
The cognitive processes which form our experiences have been forged over many millions of years of genetic variation alternating with selection. There is no reason to believe that they have been optimised to capture reality comprehensively and accurately. We ought to acknowledge that our knowledge of reality is inherently uncertain. To see clearly, we ought to seek error correction through creatively pursuing good explanations.
Science is a human process and as such, it is unsurprising that it contains bias and dogma. It is a well-established fact that humans are far from being optimally rational agents. Science helped us move away from the tyranny of the Church, but it didn’t eliminate dogma; it merely replaced the Church with the scientific establishment. These problems don’t mean that science is bad in principle. However, the way a lot of science is done today is dogmatic, thus potentially closing off the growth of knowledge in many fields.
Prejudice
Explanations in science traditionally took a reductionist approach. Such an approach claims that to have a complete explanation of what is going on at the higher levels of abstraction, one must understand what is happening on lower levels. For example, to understand humans, one needs to understand their biological organs. Understanding the organs entails understanding cells, then biochemistry, physical chemistry, physics, and all the way down to fundamental physics. This quote by Douglas Hofstadter captures this prejudice beautifully:
Saying that studying the brain is limited to the study of physical entities would be like saying that literary criticisms must focus on paper and bookbinding, ink and its chemistry, page sizes, and margin widths, typefaces, and paragraph lengths, and so forth. But what about the high abstractions that are the heart of literature—plot and character, style and point of view, irony and humour, allusion and metaphor, empathy and distance, and so on? Where did these crucial essences disappear in the list of topics for literary critics?
Reductionism is a prejudice. It is historically understandable because the physical sciences developed fastest, and it so happens that some of the best explanations in physics have been bottom-up. For example, space and time, elementary particles, and so on. But it has never been the case, even within physics, let alone other sciences, that all good explanations are reductionist. For example, the Theory of Evolution has achieved immense success without dealing with atoms.
Modes of Explanations
The quest for good explanations implies that we must not have the reductionist prejudice. If we do find an explanation that is on a higher level and it is a good explanation—provides hard-to-vary assertions about reality—then it is simply irrational to reject it just because it doesn’t have the reductionist form. We have been historically taught that reductionist explanations are the kind of explanations we should pursue. Still, by deeply understanding the power of good explanations, we become more open to different modes of explanation.
It is nearly always the case that whenever someone finds a new and much deeper theory, it is not only a better explanation but also a different mode of explanation. For example, in physics, Einstein’s explanation of gravity in terms of curved space-time was a new mode of explanation. Relativity was not merely a tweak on Newtonian gravity, such as replacing the inverse square law with an inverse cube law. Relativity was a different kind of explanation altogether. It explains that space and time, which Newton’s theory regards as immutable background entities, are a dynamical space-time object which bucks and weaves and explains all sorts of things apart from just the motion of planets.
Science and Humanity
Science appears to be largely a story of us fighting our way past anthropocentrism, this notion that we are at the centre of things. We are not special; we share more than half our genes with a banana. This notion is the principle of mediocrity. Deutsch believes that this is literally true but nevertheless believes that we are central to any understanding of the universe.
First, if you think of that chemical scum, namely us, and possibly other conscious, intelligent beings, then to study that scum fully is impossible. Unlike every other scum in the universe, this scum is creating new knowledge, and the growth of knowledge is profoundly unpredictable. As a consequence, to understand this scum (never mind predict it) entails understanding everything that is happening in the universe. The growth of knowledge is profoundly unpredictable because if we could predict it ahead of time, we could invent future inventions now, yet we cannot. To predict the future perfectly, we would need to simulate the future perfectly; to simulate the future, we would need to simulate the universe; and the best we can do to simulate the universe is to watch it unfold in real time.
Second, the other way around is that the reach of human knowledge and human intentions on the physical world is unlimited. We are only used to having a relatively tiny effect on this small insignificant planet and the rest of the universe to be completely beyond our ken, but that is just a parochial misconception. We know that there are no limits (by the universality of computation) on how much we can affect the universe if we choose to.
We, and all other conscious, intelligent beings, are completely central to any understanding of the universe.
Remark: It is important to define what Deutsch means by computation. Computation—within any laws of physics—is the instantiation of abstract objects and their relationships using physical objects and their motions and interactions. Computation is universal because we can create a computer within the universe that can simulate any physical process.
The goal of science isn’t to deem humanity worthless. Experience plays an important role in science. Our knowledge is theory-laden, meaning there is no such thing as a raw, comprehensive, accurate experience of reality; all our experience of the world comes through layers of conscious and unconscious interpretation. The role of human experience in science is to prompt new conjectures and to choose between conjectures that have already been guessed. That is what learning from experience is about.
Conclusion
All progress comes from the quest for good explanations—hard-to-vary assertions about reality. There isn’t an authoritative source of knowledge. Still, we can use this process of seeking good explanations through conjectures alternating with criticism to grind out knowledge about reality that is sufficiently reliable for us to treat as provisionally true and act upon.
Source? I’ve never heard of such a telescope. All the modern telescopes I’ve ever used have solid, fixed mirrors. I thought the way to get around atmospheric distortions is to put your telescope in orbit.
Source (emphasis added by me):
Cool!
Deutsch is really opposed to induction, though.
Thank you for pointing this out, by the way. This is an important nuance. I just read this: Simple refutation of the ‘Bayesian’ philosophy of science.
And I am now really confused and conflicted. I would love it if someone could enlighten me on how Deutsch’s definition of explanation (hard-to-vary assertions about reality) and Bayesian probability conflict with each other. I am missing something very subtle here.
For context, I am aware of Popper and falsification, but wouldn’t a theory eventually become practically falsified within Bayesian updating if there is enough evidence against it?
I read that too some time ago, and he makes a really basic error, which made me lose some respect for him (if I was able to catch that error, surely he should have, and if he didn’t, then he should have heard a correction and fixed it by now).
The error is the assumption that what Bayes does is compare between H and !H, or to take his example, ‘the sun is powered by nuclear fusion’ vs ‘the sun is not powered by nuclear fusion’. What the math really says you should do is compare all possible hypotheses, so the term !H isn’t itself an explanation/hypothesis; it’s the sum of all other explanations/hypotheses.
I think Abram Demski (who, unlike me, is actually qualified to talk about this stuff) talked about this error in Bayes’ Law is About Multiple Hypothesis Testing (though not directly referring to Deutsch).
I don’t know if Bayes and Deutsch’s view of explanation actually conflict. It feels to me like he kinda wants them to conflict.
Wow, this is honestly baffling. It sounds as if Deutsch doesn’t know about the generalised form of Bayes’ theorem (I’m sure he does know, which makes me feel worse).
$$P(H_i \mid E) = \frac{P(E \mid H_i)\,P(H_i)}{\sum_j P(E \mid H_j)\,P(H_j)}$$
You make an excellent point. Bayes’ theorem can be applied to all possible hypotheses, not just H and ¬H.
If a top physicist can be this biased, then I cannot be surprised by anything anymore.
Thank you very much for your response Yoav Ravid.
Bayes can explain why negative, disconfirmatory evidence counts more than positive support, and so sport a version of falsificationism. But it can’t rule out positive support, so doesn’t imply the more extreme Popperian doctrine that there is no justification.
A hard-to-vary explanation is a minimal explanation, one with no redundant parts. So hardness-to-vary is a simplicity criterion, a form of Occam’s razor. Compared to the simplicity criterion favoured by Bayesians, programme length, it is rather subjective. Neither criterion answers the hard problem, the problem of why simplicity implies truth. But Deutsch is more interested in Knowledge, which is left very vaguely defined.
In theory, Bayes is about adjusting the credences of Every Possible Hypothesis. In practice, you don’t know every possible hypothesis, so there is some truth to Deutsch’s claim that not-H is a blob … you might be able to locate some hypotheses other than H, but you have no chance of specifying all infinity.
Bayesians tend to be incurious about where hypotheses come from. That’s one of Chapman’s criticisms, that Bayes isn’t a complete epistemology because it can’t generate hypotheses. Popperians, by contrast, put a lot of emphasis on hypothesis-formation as an informal, non-mechanistic process.
Good points. There were several chapters in Rationality: A-Z dedicated to this. According to Max Tegmark’s speculations, all mathematically possible universes exist, and we happen to be in one described by a simple Standard Model. I suspect that this question about why simple explanations are so effective in this universe is unanswerable but still fun to speculate about.
Good points about the lack of emphasis on hypothesis-formation within the Bayesian paradigm. Eliezer talks about this a little in Do Scientists Already Know This Stuff?
I long for a deeper treatment on hypothesis-formation. Any good books on that?
What does “effective” mean? If you are using a simplicity criterion to decide between theories that are already known to be predictive, as in Solomonoff induction, then simplicity doesn’t buy you any extra predictiveness.
Oh yes, I didn’t mention the differences between the worldview presented in Rationality: A-Z and that of David Deutsch.
For example, Deutsch is strongly opposed to the dogmatic nature of Empiricism, which is the sixth virtue of rationality in the LessWrong worldview. My take is that Deutsch believes that explanatory theories are more foundational to our understanding of reality than our experiences or observations. He asserts that we interpret our experiences and observations of reality through explanatory theories. He further asserts that experiences and observations are not the sources of our theories. For example, Einstein came up with Relativity with no direct observational data; he didn’t use the perihelion precession of Mercury. Instead, experiences and observations are what we use to judge competing explanatory theories.
I don’t feel too strongly either way at this point in my journey. I think Deutsch makes a good point, but so does Eliezer. I will probably start to feel more strongly about this in one direction or the other as I study more science.
Whenever I find myself in a situation where I’m around people arguing about -isms or definitions, I usually find that the meaningful parts of the disagreement get hidden in the small words in the sentences. Like when I try to find a concise definition of empiricism, I’m told it’s that “all knowledge is derived from sense-experience.” Well, what does “derived from” mean? That phrase can easily include all of epistemic rationality. What does “all” mean? Obviously some level of information comes from our genes instead, but is that “knowledge”? And is knowledge quantitative or categorical? What is sense-experience? Does it include every bit of physical or chemical information that affects our biology from the moment of conception, or only what registers to our conscious awareness through the traditional five senses, or something else?
In other words, I’m saying it’s very important that EY labeled the sixth virtue “empiricism,” and not “Empiricism.” That capital “E” can hide a lot of assumptions. And, of course, he labeled empiricism the sixth virtue, after argument and four others. I’m also saying that in many of the cases where the structure of language forces us to use words as if they drew fairly firm boundaries, the underlying reality is often continuous and nebulous.
In a literal sense, Eliezer said, “The roots of knowledge are in observation.” If we took this statement in isolation to Deutsch, he would vehemently disagree and tell us, “No, we interpret observations through explanatory theories.” However, I don’t think Eliezer and Deutsch disagree here. Both agree that there is a map and a territory and that the map comprises models, i.e., explanatory theories.
This isn’t quite right. The tiny probability of an observation given the hypothesis does not imply that the posterior of the hypothesis will be low. Suppose there’s a lottery with 10 million tickets. We have very good reasons to believe the lottery is fair. Still, whoever the winner X is, P(X is the winner|The lottery is fair) = 1/10000000. The reason P(The lottery is fair|X is the winner) is not low is that the alternative hypothesis “The lottery is not fair” also does a poor job at predicting the result (why rigged in favor of X specifically and not the other 9999999 people?) and the prior on P(The lottery is not fair) is very low. Ok, but what about the hypothesis “The lottery is 100% rigged in favor of X”? The probability that X is the winner given this alternative is 1. But the prior on that hypothesis is basically zero, so it doesn’t matter. (Things are different if we have reasons to think X is suspicious. Then the fact that X won is a good reason to suspect the lottery isn’t fair.)
tl;dr: The posterior P(H1|E) is tiny iff P(H1)P(E|H1) is tiny relative to all other P(Hi)P(E|Hi).
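The lottery argument can be checked with exact arithmetic. This is only a sketch: the 1/1000 prior that the lottery is rigged, and the way that prior is split among ticket holders, are assumptions made up for illustration and are not figures from the comment.

```python
from fractions import Fraction

N = 10_000_000  # number of tickets, as in the lottery example

def posterior(priors, likelihoods):
    """Generalised Bayes: normalise prior * likelihood over all hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

# P(X is the winner | hypothesis)
likelihoods = {
    "fair": Fraction(1, N),
    "rigged_for_x": Fraction(1),
    "rigged_for_other": Fraction(0),
}

p_rigged = Fraction(1, 1000)  # assumed prior that the lottery is rigged at all

# Case 1: no reason to single out X, so the rigging prior is spread
# evenly over all N ticket holders.
no_suspicion = {
    "fair": 1 - p_rigged,
    "rigged_for_x": p_rigged / N,
    "rigged_for_other": p_rigged * (N - 1) / N,
}

# Case 2: independent reasons to suspect X, so the whole rigging prior
# sits on "rigged in favour of X".
suspicion = {
    "fair": 1 - p_rigged,
    "rigged_for_x": p_rigged,
    "rigged_for_other": Fraction(0),
}

post_no_suspicion = posterior(no_suspicion, likelihoods)
post_suspicion = posterior(suspicion, likelihoods)

print(post_no_suspicion["fair"])
print(float(post_suspicion["fair"]))
```

In the first case the posterior that the lottery is fair stays at exactly 999/1000, the same as its prior, even though P(X is the winner|The lottery is fair) is one in ten million. In the second case it drops to 999/10000999, roughly 0.0001. The likelihoods are identical in both cases, so the difference comes entirely from the priors on the alternatives.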
I agree with you here. I made a mistake but on the bright side, I learnt a lot about the generalised form of Bayes’ theorem which applies to all possible hypotheses. This was also how Eliezer explained this relationship between the posterior and the numerator in Decoherence is Falsifiable and Testable. I was trying to simplify the relationship between Bayes’ theorem and Deutsch’s criterion for good explanations for the sake of the post but I oversimplified too much.
I still think that Bayes’ theorem and Deutsch’s criterion for good explanation are compatible and that, in a practical sense, one can be explained in terms of the other, but using the generalised form of Bayes is necessary.
I updated my post to explain that this part is slightly incorrect.
It seems that he makes the same mistake in that post (though he makes it clear in the rest of the essay that alternatives matter). You paraphrased him right.
Incidentally, Popper also thought that we couldn’t falsify a theory unless we have a non-ad hoc alternative that explains the data better.
This is so interesting. Do you know where I can read more about this? Conjectures and Refutations?
I think focusing on the phenomenon of “explanation” is pretty helpful not just in science but also in philosophy—there are lots of places where people say they want an explanation for this or that thing, but what they mean can vary from case to case and person to person. But for this more general sort of explanation, I don’t think the definition of “hard to vary model of the world” works, there needs to be more of a social / psychological perspective.
This is more apparent for some people I know than others. (As the joke goes)
I tend to agree. It isn’t easy to generalise what entails a successful explanation, especially as one goes higher up the layers of abstraction (as you’ve put it) or further out to the more infeasibly testable realm.
What do you think is an elegant way to define the phenomenon of explanation that is more general than “hard-to-vary assertions about reality”?
I’m not sure there’s a neat form. Consider the explanation of why a mirror flips left and right but not up and down. Maxwell’s equations predict mirrors just fine, but it’s certainly not what people (well, most people) want from this explanation. Even if we try to be elegant we’ll probably have to say complicated words like “the listener’s understanding”.
Related: A previous LessWrong review of The Beginning of Infinity
This is a fascinating critique of David Deutsch and The Beginning of Infinity by one of his former colleagues. It is ironic that Deutsch sees himself as an expert on counter-dogma, yet he is dogmatic about his convictions. Cultish Countercultishness springs to mind.
That link seems much more a critique of Deutsch than The Beginning of Infinity. Except the part on misquotations, which is actually its own post.
I agree, it is more a critique of Deutsch as a person than of the book. I still think it is a good book overall.