Which is why mathematics isn’t science.
I sense an argument about definitions of words. Please don’t.
“what is science” is not a mere matter of definitions. It’s fundamental to how we decide how certain we are of various propositions.
Um… no it isn’t? A Bayesian processes evidence the same way whether or not it’s labeled “science”.
If you’re talking about the word “science” as some sort of FDA seal of approval, invented so people can quickly see who to trust without examining the claims in detail, then I see no reason to exclude math. Do you think math gives less reliable conclusions than empirical disciplines?
A Bayesian may process probabilities the same way, but information is not evaluated the same way. Determining that a piece of information was derived scientifically does not provide a “seal of approval”, it tells us how to evaluate the likelihood of that information being true.
For instance, if I know that a piece of information was derived via scientific methods, I know to look at related studies. A single study is never definitive, because science involves reproducible results based on empirical evidence. Further studies may alter my understanding of the information the first study produced.
On the other hand, if I know that a piece of information was derived mathematically, I need only look at a single proof. If the proof is sound, I know that the premises lead inexorably to the conclusion. But encountering a single incorrect premise or step means that the conclusion has zero utility to the Bayesian; a new proof must be created. Experiments, by contrast, may yield some useful evidence even if the study has flawed premises or methods; precisely what parts are useful requires an understanding of what science is.
So this is actually important—it’s not just a matter of definitions.
Thanks, that’s a valid argument that I didn’t think of.
But it’s sorta balanced by the fact that a lot of established math is really damn established. For example, compare Einstein’s general relativity with Brouwer’s fixed point theorem. Both were invented at about the same time, both are really important and have been used by lots and lots of people. Yet I think Brouwer’s theorem is way more reliable and less likely to be overturned than general relativity, and I’m not sure if anyone anywhere thinks otherwise.
I’m not sure if “overturning” general relativity is the appropriate description. We may well find a broader theory which contains general relativity as a limiting case, just as general relativity has special relativity and Newtonian mechanics as limiting cases. With the plethora of experimental verifications of general relativity, however, I wouldn’t expect to see it completely discarded in the way that, e.g., phlogiston theory was.
Oh, I’m not calling mathematics more or less reliable than science. I’m saying that the ways in which one would overturn an established useful theorem would be very different from the ways in which one would overturn an established scientific theory. Another way in which mathematics is more reliable is that bias is irrelevant. Scientists have to disclose their conflicts of interest because it’s easy for those conflicts to interfere with their objectivity during data collection or analysis, and so others must pay special attention. Mathematicians don’t need to because all their work can be contained in one location, and can be checked in a much more rigorous fashion.
This doesn’t follow. If, for example, one has a single proof, finds a hole in it, and the hole looks like it rests on plausible assumptions, then one should still increase one’s confidence that the claim is true. Physicists, for instance, are very fond of assuming that terms in a series are of lower order even when they can’t actually prove it. Very often, under reasonable assumptions, their claims are correct. To use a specific example, Kempe’s “proof” of the four color theorem had a hole, and so a repaired version could only prove that planar maps require at most five colors. But the general thrust of the argument provided a strong plausibility heuristic for believing the claim as a whole.
Similarly, from a Bayesian standpoint, seeing multiple distinct proofs of a claim should make one more confident in the claim, since even if one of the proofs has an unseen flaw, the others are likely to go through.
(There are complicating factors here. No one seems to have a good theory of confidence for mathematical statements that allows for objective priors, since most standard objective priors (such as those based on some notion of computability) only make sense if one can perform arbitrary calculations correctly. Similarly, it isn’t clear how one can meaningfully talk about, say, the probability that Peano arithmetic is consistent.)
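To make the multiple-proofs point concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions that are not in the discussion above: each proof that survives checking is sound with some probability `p_sound`, flaws in distinct proofs are independent, and a flawed proof tells us nothing (so we fall back to the prior). The function name and parameters are hypothetical, and, per the caveat above, this is not a real theory of confidence in mathematical statements.

```python
def confidence_after_proofs(prior: float, p_sound: float, n_proofs: int) -> float:
    """Toy model: probability a claim is true after seeing n_proofs proofs.

    Assumptions (illustrative only):
      - each proof is sound with probability p_sound, independently;
      - one sound proof settles the claim;
      - if every proof is flawed, we are left with the prior.
    """
    # Probability that every single proof hides a flaw.
    all_flawed = (1.0 - p_sound) ** n_proofs
    # The claim can only be false if all proofs are flawed AND the prior misses.
    return 1.0 - all_flawed * (1.0 - prior)


if __name__ == "__main__":
    for n in (0, 1, 2, 3):
        print(n, confidence_after_proofs(prior=0.5, p_sound=0.9, n_proofs=n))
```

Under these assumptions confidence rises with each additional distinct proof, which is the informal claim being made. The independence assumption is the weakest part: proofs that share a lemma can share a flaw.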
I don’t think we actually disagree at all. Your “hole” is really the introduction of additional premises. If the premises are true and the reasoning sound, the conclusions follow. If they are shown to be untrue, you can discard the conclusion. Mathematics rarely has a way of evaluating the likelihood that its premises are true; usually the best it can do is to show that certain premises are or are not compatible with one another. What you are saying regarding multiple distinct proofs of a claim is true according to some informal logic, but not in any strict mathematical sense. Mathematically, you’ve either proven something or you haven’t. Mathematicians may still be convinced by scientific, theological, literary, financial, etc. arguments of course.
Not really. Consider, for example, someone who has seen Kempe’s argument. They should have a higher confidence that, say, “The four color theorem is true in ZFC” than someone who has not seen Kempe’s argument. There’s no additional premise being added, but Kempe’s argument is clearly wrong.
Not sure what you mean here. It looks like the sentence was cut off?
Would you mind explaining in a little more detail why you say a person who has seen Kempe’s flawed proof should have higher confidence than one who has not? Do you mean that it’s so emotionally compelling that one’s mind is convinced even if the math doesn’t add up? Or that the required (previously-hidden) premise that allows Kempe to ignore the degree 5 vertex has some possibility of truth, so that the conclusion has an increased likelihood of truth?
also: fixed the end.
Hmm, I’m not sure how to do so without just going through the whole proof. Essentially, Kempe’s proof showed that a smallest counterexample graph couldn’t have certain properties. One part of the proof was showing that the graph could not contain a vertex of degree 5, but this part was flawed. Kempe did show, however, that it couldn’t contain a vertex of degree 4, and moreover, he showed that any minimal counterexample must have a vertex of degree 5. This makes us more confident in the original claim, since a minimal counterexample has to have a very restricted-looking form.
Replying to the fixed end here so as to minimize confusion:
Well, yes, but what I was addressing was your claim that “encountering a single incorrect premise or step means that the conclusion has zero utility to the Bayesian”, which is wrong. I agree that a flawed proof is not a proof.
And yes, the logic is in any case informal. See my earlier parenthetical remark. I actually consider the problem of confidence in mathematical reasoning to be one of the great difficult open problems within Bayesianism. One reason I don’t (generally) self-identify as a Bayesian is due to an apparent lack of this theory. (This itself deserves a disclaimer that I’m by no means at all an expert in this field and so there may be work in this direction but if so I haven’t seen any that is at all satisfactory.)
I think you are assuming I count a dubious premise as an incorrect premise. Obviously, a merely dubious premise allows the conclusion to have some utility to the Bayesian.
I really don’t think we actually disagree.
Really? Even incorrect premises can be useful. For example, one plausibility argument for the Riemann hypothesis rests on assuming that the Möbius function behaves like a random variable. But that’s a false statement. Nevertheless, it acts close enough to being a random variable that many find this argument to be evidence for RH. And there’s been very good work trying to take this false statement and make true versions of it.
Similarly, if one believes what you have said, then one would have to conclude that, in the 1700s, all of calculus would have been useless because it rested on the notion of infinitesimals, which didn’t exist. The premise was incorrect, but the results were sound.
Incidentally, as more evidence, apparently this AC0 conjecture has just been proved true by Ben Green (rather, he noticed that other people had already done stuff that had this as a consequence, which the people asking the question hadn’t known about).
Ok, I need to refine my description of math a bit. I’d claimed that an incorrect premise gives useless conclusions; actually as you point out if we have a close-to-correct premise instead, we can have useful conclusions. The word “instead” is important there, because otherwise we can then add in a correct contradictory premise, generating new and false conclusions. In some sense this is necessary to all math, most evidently geometry: we don’t actually have any triangles in the world, but we use near-triangles all the time, pretending they’re triangles, with great utility.
Also, to look again at Kempe’s “proof”: we can see where we can construct a vertex of degree 5 where his proof does not hold up. And we can try to turn that special case back into a map. The fact that nobody’s managed to construct an actual map relying on that flaw does not give any mathematical evidence that an example can’t exist. Staying within the field of math, the Bayesian is not updated and we can discard his conclusion. But we can step outside math’s rules and say “there’s a bunch of smart mathematicians trying to find a counterexample, and Kempe shows them exactly where the counterexample would have to be, and they can’t find one.” That fact updates the Bayesian, but reaches outside the field of math. The behavior of mathematicians faced with a math problem looks like part of mathematics, but actually isn’t.
That simply doesn’t follow: why does involving reproducible results imply not being definitive?
Empirical results are never ‘definitive’ as in being 100.0% certain, but we can get very close. Whether this is done in a single study or with multiple studies doesn’t matter at all. In practice there are good reasons to want multiple studies, but they have more to do with questions not addressed in a single study, trustworthiness of the authors, etc.
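The "single study or multiple studies" point can be illustrated with a minimal Beta-Binomial sketch in Python. Everything here is an assumption for illustration: the uniform prior, the made-up counts, and the hypothetical `update` helper. It also assumes both studies measure the same quantity under the same conditions and report honestly, which is exactly what the trust and human-error concerns are about.

```python
from typing import Tuple

Beta = Tuple[int, int]  # (alpha, beta) parameters of a Beta distribution


def update(belief: Beta, successes: int, failures: int) -> Beta:
    """Conjugate Bayesian update of a Beta belief with binomial data."""
    alpha, beta = belief
    return alpha + successes, beta + failures


prior: Beta = (1, 1)  # uniform prior, an arbitrary illustrative choice

# One large study: 60 successes, 40 failures (made-up numbers).
one_big = update(prior, 60, 40)

# The same 100 observations split across two smaller studies.
after_first = update(prior, 33, 17)
two_small = update(after_first, 27, 23)

print(one_big, two_small)  # both are (61, 41)
```

In this idealized setting the posterior depends only on the pooled data, not on how it was divided into studies. What the sketch leaves out is any model of error or bias that is shared within a study but not across studies.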
Even wrong mathematical proofs have a non-zero utility, because they often lead to new insights. For example, if only the last of 100 steps is wrong, then you are 99 steps closer to some goal.
A single study can’t get close to 100% certainty, because that’s just not how science works. If you look at all the studies whose results were reported at 95% confidence, you’ll find that well over 5% reached conclusions now believed to be false. There are issues of trust, issues of data collection errors, issues of statistical evaluation, the fact that scientific methods are designed under the assumption that studies will be repeated, etc.
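One standard way to see why more than 5% of such findings can be false is the base-rate calculation below, sketched in Python. The numbers for the prior fraction of true hypotheses and for statistical power are illustrative assumptions, not figures from this discussion, and the function name is hypothetical. The sketch also ignores the trust and data-collection issues mentioned above, which would only make things worse.

```python
def false_finding_rate(alpha: float, power: float, prior_true: float) -> float:
    """Fraction of 'significant' results that are false positives.

    alpha      : significance threshold (0.05 for a 95% confidence level)
    power      : probability a study detects an effect that is real
    prior_true : assumed fraction of tested hypotheses that are true
    """
    false_positives = alpha * (1.0 - prior_true)
    true_positives = power * prior_true
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Illustrative assumption: 1 in 10 tested hypotheses is true, power is 80%.
    print(false_finding_rate(alpha=0.05, power=0.8, prior_true=0.1))
```

With those assumed inputs, 36% of the significant results are false, even though every individual study used a 5% threshold. The threshold bounds the false-positive rate among null hypotheses, not the error rate among published positives.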
The steps within unsound mathematical proofs may be valuable, but their conclusions are not.
The current scientific method is in no way ideal. If a study were properly Bayesian, then you should be able to confidently learn from its results. That still leaves issues of trust and the possibility of human error, but there might also be ways to combat those. But in a human society, repeating studies is perhaps the best thing one can hope for.
Agreed. That is the one part of an unsound proof that is useless.
Can you describe a better, more Bayesian scientific method? The main way I would change it is to increase the number of studies that are repeated, to improve the accuracy of our knowledge. How would you propose to improve our confidence other than by showing that an experiment has reproducible results?