It would be more accurate to say that LW-style Bayesians consider falsificationism to be subsumed under Bayesianism as a sort of limiting case. Falsificationism as originally stated (i.e., confirmations are irrelevant; only falsifications advance knowledge) is an exaggerated version of a mathematically valid claim. From An Intuitive Explanation of Bayes’ Theorem:
Previously, the most popular philosophy of science was probably Karl Popper’s falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning. Karl Popper’s idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules; if p(X|A) ~ 1 - if the theory makes a definite prediction - then observing ~X very strongly falsifies A. On the other hand, if p(X|A) ~ 1, and we observe X, this doesn’t definitely confirm the theory; there might be some other condition B such that p(X|B) ~ 1, in which case observing X doesn’t favor A over B. For observing X to definitely confirm A, we would have to know, not that p(X|A) ~ 1, but that p(X|~A) ~ 0, which is something that we can’t know because we can’t range over all possible alternative explanations. For example, when Einstein’s theory of General Relativity toppled Newton’s incredibly well-confirmed theory of gravity, it turned out that all of Newton’s predictions were just a special case of Einstein’s predictions.
You can even formalize Popper’s philosophy mathematically. The likelihood ratio for X, p(X|A)/p(X|~A), determines how much observing X slides the probability for A; the likelihood ratio is what says how strong X is as evidence. Well, in your theory A, you can predict X with probability 1, if you like; but you can’t control the denominator of the likelihood ratio, p(X|~A) - there will always be some alternative theories that also predict X, and while we go with the simplest theory that fits the current evidence, you may someday encounter some evidence that an alternative theory predicts but your theory does not. That’s the hidden gotcha that toppled Newton’s theory of gravity. So there’s a limit on how much mileage you can get from successful predictions; there’s a limit on how high the likelihood ratio goes for confirmatory evidence.
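To put the likelihood-ratio arithmetic from the quoted passage in concrete terms, here is a minimal sketch; the helper function and every probability in it are my own illustrative assumptions, not values from the essay:

```python
# Minimal sketch of the likelihood-ratio update described in the quoted passage.
# All numbers are illustrative assumptions.

def update(prior_a, p_x_given_a, p_x_given_not_a):
    """Return p(A|X) via Bayes' theorem."""
    p_x = p_x_given_a * prior_a + p_x_given_not_a * (1 - prior_a)
    return p_x_given_a * prior_a / p_x

# A predicts X almost certainly, but rival theories also make X fairly likely:
print(update(prior_a=0.5, p_x_given_a=0.99, p_x_given_not_a=0.50))  # ~0.66: modest confirmation
# The same setup, but ~X is observed instead (likelihoods are complemented):
print(update(prior_a=0.5, p_x_given_a=0.01, p_x_given_not_a=0.50))  # ~0.02: strong disconfirmation
```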
[i]n your theory A, you can predict X with probability 1...
This seems the key step for incorporating falsification as a limiting case; I contest it. The rules of Bayesian rationality preclude assigning an a priori probability of 1 to a synthetic proposition: nothing empirical is so certain that refuting evidence is impossible. (Is that assertion self-undermining? I hope that worry can be bracketed.) As long as you avoid assigning probabilities of 1 or 0 to your priors, updating will never yield an outcome at those extremes.
But since p(X|A) is always “intermediate,” observing X will never strictly falsify A—which is a good thing because the falsification prong of Popperianism has proven at least as scientifically problematic as the nonverification prong.
I don’t think falsification can be squared with Bayes, even as a limiting case. In Bayesian theory, verification and falsification are symmetric (as the slider metaphor really indicates). In principle, you can’t strictly falsify a theory empirically any more (or less) than you can verify one. Verification, as the quoted essay confirms, is blocked by the > 0 probability mandatorily assigned to unpredicted outcomes; falsification is blocked by the < 1 probability mandatorily assigned to the expected results. It is no less irrational to be certain that X holds given A than to be certain that X fails given not-A. You are no more justified in assuming absolutely that your abstractions don’t leak than in assuming you can range over all explanations.
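A small sketch of this symmetry claim, with made-up numbers of my own: when the likelihoods stay strictly between 0 and 1, the evidence for and against A is finite in both directions, so no single observation yields a posterior of exactly 0 or exactly 1.

```python
# Odds-form Bayes: posterior odds = prior odds * likelihood ratio.
# With likelihoods strictly inside (0, 1), both ratios are finite and nonzero,
# so neither observation pushes p(A) to an extreme. Numbers are illustrative.

prior_odds = 1.0                    # p(A) = 0.5
p_x_given_a, p_x_given_not_a = 0.999, 0.01

lr_confirm = p_x_given_a / p_x_given_not_a              # observing X:  99.9
lr_falsify = (1 - p_x_given_a) / (1 - p_x_given_not_a)  # observing ~X: ~0.00101

for lr in (lr_confirm, lr_falsify):
    posterior_odds = prior_odds * lr
    print(posterior_odds / (1 + posterior_odds))  # ~0.990 and ~0.001: neither 1 nor 0
```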
In principle, you can’t strictly falsify a theory empirically any more (or less) than you can verify one.
This throws the baby out with the bathwater; we can falsify and verify to degrees. Refusing the terms verify and falsify because we are not able to assign infinite credence seems like a mistake.
This throws the baby out with the bathwater; we can falsify and verify to degrees. Refusing the terms verify and falsify because we are not able to assign infinite credence seems like a mistake.
I agree; that’s why “strictly.” But you seem to miss the point, which is that falsification and verification are perfectly symmetric: whether you call the glass half empty or half full on either side of the equation wasn’t my concern.
Two basic criticisms apply to Popperian falsificationism: 1) it ignores verification (although the “verisimilitude” doctrine tries to overcome this limitation); and 2) it does assign infinite credence to falsification.
No. 2 doesn’t comport with the principles of Bayesian inference, but seems part of LW Bayesianism (your term):
This allowance of a unitary probability assignment to evidence conditional on a theory is a distortion of Bayesian inference. The distortion introduces an artificial asymmetry into the Bayesian handling of verification versus falsification. It is irrational to pretend—even conditionally—to absolute certainty about an empirical prediction.
[i]n your theory A, you can predict X with probability 1...
[...] The rules of Bayesian rationality preclude assigning an a priori probability of 1 to a synthetic proposition: nothing empirical is so certain that refuting evidence is impossible.
We all agree on this point. Yudkowsky isn’t supposing that anything empirical has probability 1.
In the line you quote, Yudkowsky is saying that even if theory A predicts data X with probability 1 (setting aside the question of whether this is even possible), confirming that X is true still wouldn’t push our confidence in the truth of A past a certain threshold, which might be far short of 1. (In particular, merely confirming a prediction X of A can never push the posterior probability of A above p(A|X), which might still be too small because too many alternative theories also predict X). A falsification, on the other hand, can drive the probability of a theory very low, provided that the theory makes some prediction with high confidence (which needn’t be equal to 1) that has a low prior probability.
That is the sense in which it is true that falsifications tend to be more decisive than confirmations. So, a certain limited and “caveated”, but also more precise and quantifiable, version of Popper’s falsificationism is correct.
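A worked sketch of that asymmetry; the helper function and every number below are my own illustrative assumptions. Rival theories (lumped together as “rest”) also predict X fairly strongly, so confirming X helps A only a little, while observing ~X after A confidently predicted X drives A’s probability very low.

```python
# Confirmation vs. falsification for a theory A, with rivals lumped into "rest".
# All probabilities are illustrative assumptions.

def posterior_a(prior_a, p_x_given_a, p_x_given_rest, observed_x):
    prior_rest = 1 - prior_a
    la = p_x_given_a if observed_x else 1 - p_x_given_a
    lr = p_x_given_rest if observed_x else 1 - p_x_given_rest
    return la * prior_a / (la * prior_a + lr * prior_rest)

# Confirmation: A predicts X with 0.99, the rivals collectively give X 0.80.
print(posterior_a(0.2, 0.99, 0.80, observed_x=True))   # ~0.24: barely above the 0.2 prior
# Falsification: same numbers, but ~X is what we actually observe.
print(posterior_a(0.2, 0.99, 0.80, observed_x=False))  # ~0.012: A is nearly ruled out
```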
But since p(X|A) is always “intermediate,” observing X will never strictly falsify A—which is a good thing because the falsification prong of Popperianism has proven at least as scientifically problematic as the nonverification prong.
Yes, no observation will drive the probability of a theory down to precisely 0. The probability can only be driven very low. That is why I called falsificationism “an exaggerated version of a mathematically valid claim”.
I don’t think falsification can be squared with Bayes, even as a limiting case. In Bayesian theory, verification and falsification are symmetric (as the slider metaphor really indicates). In principle, you can’t strictly falsify a theory empirically any more (or less) than you can verify one. Verification, as the quoted essay confirms, is blocked by the > 0 probability mandatorily assigned to unpredicted outcomes; falsification is blocked by the < 1 probability mandatorily assigned to the expected results. It is no less irrational to be certain that X holds given A than to be certain that X fails given not-A. You are no more justified in assuming absolutely that your abstractions don’t leak than in assuming you can range over all explanations.
As you say, getting to probability 0 is as impossible as getting to probability 1. But getting close to probability 0 is easier than getting equally close to probability 1.
This asymmetry is possible because different kinds of propositions are more or less amenable to being assigned extremely high or low probability. It is relatively easier to show that some data has extremely high or low probability (whether conditional on some theory or a priori) than it is to show that some theory has extremely high conditional probability.
Fix a theory A. It is very hard to think up an experiment with a possible outcome X such that p(A | X) is nearly 1. To do this, you would need to show that no other possible theory, even among the many theories you haven’t thought of, could have a significant amount of probability, conditional on observing X.
It is relatively easy to think up an experiment with a possible outcome X, which your theory A predicts with very high probability, but which has very low prior probability. To accomplish this, you only need to exhibit some other a priori plausible outcomes different from X.
In the second case, you need to show that the probability of some data is extremely high conditional on the theory and extremely low a priori. In the first case, you need to show that the a posteriori probability of a theory is extremely high.
In the second case, you only need to construct enough alternative outcomes to certify your claim. In the first case, you need to prove a universal statement about all possible theories.
One root of the asymmetry is this: As hard as it might be to establish extreme probabilities for data, at least the data usually come from a reasonably well-understood parameter space (the real numbers, say). But the space of all possible theories is not well understood, at least not in any computationally tractable way.
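One way to see this numerically, under assumptions that are entirely my own: treat ~A as a mixture of the rivals we have actually named plus a catch-all for theories nobody has thought of. The catch-all keeps p(X|~A) bounded away from 0, which caps the confirmatory likelihood ratio, while the falsifying ratio is not capped in the same way.

```python
# Sketch of why p(X|~A) resists being driven near 0.
# The weights and probabilities below are illustrative assumptions.

named_p_x = [0.01, 0.02, 0.005]   # p(X | each rival we did think of): all small
catchall_weight = 0.2             # prior mass (within ~A) on un-thought-of theories
catchall_p_x = 0.5                # cannot be argued down: unknown theories may well predict X
named_weight = (1 - catchall_weight) / len(named_p_x)

p_x_given_not_a = sum(named_weight * p for p in named_p_x) + catchall_weight * catchall_p_x
p_x_given_a = 0.99                # A predicts X with high (not perfect) confidence

print(p_x_given_not_a)                            # ~0.11: bounded away from 0 by the catch-all
print(p_x_given_a / p_x_given_not_a)              # confirmation ratio: roughly 9 to 1 in favor of A
print((1 - p_x_given_not_a) / (1 - p_x_given_a))  # falsification ratio: roughly 89 to 1 against A
```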
In the second case, you only need to construct enough alternative outcomes to certify your claim. In the first case, you need to prove a universal statement about all possible theories.
All these arguments are at best suggestive. Our abductive capacities suggest that proving a universal statement about all possible theories isn’t necessarily hard. Your arguments, I think, flow from and then confirm a nominalistic bias: accept concrete data; beware of general theories.
There are universal statements known with greater certainty than any particular data, e.g., that life evolved from inanimate matter and that mind always supervenes on physics.
I agree that
some universal statements about all theories are very probable, and that
some of our theories are more probable than any particular data.
I’m not seeing why either of these facts is in tension with my previous comment. Would you elaborate?
The claims I made are true of certain priors. I’m not trying to argue you into using such a prior. Right now I only want to make the points that (1) a Bayesian can coherently use a prior satisfying the properties I described, and that (2) falsificationism is true, in a weakened but precise sense, under such a prior.