Infinite Certainty
In “Absolute Authority,” I argued that you don’t need infinite certainty:
If you have to choose between two alternatives A and B, and you somehow succeed in establishing knowably certain well-calibrated 100% confidence that A is absolutely and entirely desirable and that B is the sum of everything evil and disgusting, then this is a sufficient condition for choosing A over B. It is not a necessary condition . . . You can have uncertain knowledge of relatively better and relatively worse options, and still choose. It should be routine, in fact.
Concerning the proposition that 2 + 2 = 4, we must distinguish between the map and the territory. Given the seeming absolute stability and universality of physical laws, it’s possible that never, in the whole history of the universe, has any particle exceeded the local lightspeed limit. That is, the lightspeed limit may be not just true 99% of the time, or 99.9999% of the time, or (1 − 1/googolplex) of the time, but simply always and absolutely true.
But whether we can ever have absolute confidence in the lightspeed limit is a whole ’nother question. The map is not the territory.
It may be entirely and wholly true that a student plagiarized their assignment, but whether you have any knowledge of this fact at all—let alone absolute confidence in the belief—is a separate issue. If you flip a coin and then don’t look at it, it may be completely true that the coin is showing heads, and you may be completely unsure of whether the coin is showing heads or tails. A degree of uncertainty is not the same as a degree of truth or a frequency of occurrence.
The same holds for mathematical truths. It’s questionable whether the statement “2 + 2 = 4” or “In Peano arithmetic, SS0 + SS0 = SSSS0” can be said to be true in any purely abstract sense, apart from physical systems that seem to behave in ways similar to the Peano axioms. Having said this, I will charge right ahead and guess that, in whatever sense “2 + 2 = 4” is true at all, it is always and precisely true, not just roughly true (“2 + 2 actually equals 4.0000004”) or true 999,999,999,999 times out of 1,000,000,000,000.
I’m not totally sure what “true” should mean in this case, but I stand by my guess. The credibility of “2 + 2 = 4 is always true” far exceeds the credibility of any particular philosophical position on what “true,” “always,” or “is” means in the statement above.
This doesn’t mean, though, that I have absolute confidence that 2 + 2 = 4. See the previous discussion on how to convince me that 2 + 2 = 3, which could be done using much the same sort of evidence that convinced me that 2 + 2 = 4 in the first place. I could have hallucinated all that previous evidence, or I could be misremembering it. In the annals of neurology there are stranger brain dysfunctions than this.
So if we attach some probability to the statement “2 + 2 = 4,” then what should the probability be? What you seek to attain in a case like this is good calibration—statements to which you assign “99% probability” come true 99 times out of 100. This is actually a hell of a lot more difficult than you might think. Take a hundred people, and ask each of them to make ten statements of which they are “99% confident.” Of the 1,000 statements, do you think that around 10 will be wrong?
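The arithmetic behind that question is worth making explicit. A minimal sketch, assuming perfect calibration (the 1,000-statement setup is the one from the text; the numbers below are straightforward consequences, not experimental results):

```python
# Out of n statements each held at confidence p, the expected number wrong
# is n * (1 - p), and the chance of getting *all* of them right is p**n.
n, p = 1000, 0.99

expected_wrong = n * (1 - p)
prob_all_right = p ** n

print(round(expected_wrong))     # 10
print(f"{prob_all_right:.2e}")   # about 4.3e-05
```

So even a perfectly calibrated speaker should expect roughly ten misses, and has less than a 1-in-20,000 chance of a perfect run.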
I am not going to discuss the actual experiments that have been done on calibration—you can find them in my book chapter on cognitive biases and global catastrophic risk1—because I’ve seen that when I blurt this out to people without proper preparation, they thereafter use it as a Fully General Counterargument, which somehow leaps to mind whenever they have to discount the confidence of someone whose opinion they dislike, and fails to be available when they consider their own opinions. So I try not to talk about the experiments on calibration except as part of a structured presentation of rationality that includes warnings against motivated skepticism.
But the observed calibration of human beings who say they are “99% confident” is not 99% accuracy.
Suppose you say that you’re 99.99% confident that 2 + 2 = 4. Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once. Maybe for 2 + 2 = 4 this extraordinary degree of confidence would be possible: “2 + 2 = 4” is extremely simple, and mathematical as well as empirical, and widely believed socially (not with passionate affirmation but just quietly taken for granted). So maybe you really could get up to 99.99% confidence on this one.
I don’t think you could get up to 99.99% confidence for assertions like “53 is a prime number.” Yes, it seems likely, but by the time you tried to set up protocols that would let you assert 10,000 independent statements of this sort—that is, not just a set of statements about prime numbers, but a new protocol each time—you would fail more than once.2
Yet the map is not the territory: If I say that I am 99% confident that 2 + 2 = 4, it doesn’t mean that I think “2 + 2 = 4” is true to within 99% precision, or that “2 + 2 = 4” is true 99 times out of 100. The proposition in which I repose my confidence is the proposition that “2 + 2 = 4 is always and exactly true,” not the proposition “2 + 2 = 4 is mostly and usually true.”
As for the notion that you could get up to 100% confidence in a mathematical proposition—well, really now! If you say 99.9999% confidence, you’re implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once. That’s around a solid year’s worth of talking, if you can make one assertion every 20 seconds and you talk for 16 hours a day.
Assert 99.9999999999% confidence, and you’re taking it up to a trillion. Now you’re going to talk for around a million years, over ten thousand human lifetimes, and not be wrong even once?
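These talking-time figures can be checked directly. A quick sketch under the stated assumptions (20 seconds per assertion, 16 hours of talking per day, and an illustrative 70-year lifetime):

```python
SECONDS_PER_ASSERTION = 20
HOURS_PER_DAY = 16

assertions_per_day = HOURS_PER_DAY * 3600 // SECONDS_PER_ASSERTION  # 2,880

# 99.9999% confidence: one expected error per million statements.
years_for_million = 1_000_000 / assertions_per_day / 365
print(round(years_for_million, 2))     # about 0.95 years of talking

# 99.9999999999% confidence: one expected error per trillion statements.
years_for_trillion = 1_000_000_000_000 / assertions_per_day / 365
print(round(years_for_trillion / 70))  # roughly 13,590 lifetimes of 70 years
```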
Assert a confidence of (1 − 1/googolplex) and your ego far exceeds that of mental patients who think they’re God.
And a googolplex is a lot smaller than even relatively small inconceivably huge numbers like 3 ↑↑↑ 3. But even a confidence of (1 − 1/(3 ↑↑↑ 3)) isn’t all that much closer to PROBABILITY 1 than being 90% sure of something.
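For a sense of scale, Knuth’s up-arrow notation can be sketched as a toy recursive definition (only the tiniest arguments are computable, which is rather the point):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow a ^(n) b: n=1 is plain exponentiation,
    and each additional arrow iterates the previous operation."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 3^3 = 27
print(up_arrow(3, 2, 3))  # 3^^3 = 3**3**3 = 7625597484987
# 3^^^3 = 3^^(3^^3): a power tower of 3s about 7.6 trillion levels high.
# No direct computation of it will ever finish.
```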
If all else fails, the hypothetical Dark Lords of the Matrix, who are right now tampering with your brain’s credibility assessment of this very sentence, will bar the path and defend us from the scourge of infinite certainty.
Am I absolutely sure of that?
Why, of course not.
As Rafal Smigrodzki once said:
I would say you should be able to assign a less than 1 certainty level to the mathematical concepts which are necessary to derive Bayes’s rule itself, and still practically use it. I am not totally sure I have to be always unsure. Maybe I could be legitimately sure about something. But once I assign a probability of 1 to a proposition, I can never undo it. No matter what I see or learn, I have to reject everything that disagrees with the axiom. I don’t like the idea of not being able to change my mind, ever.
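The quote’s final point falls straight out of Bayes’s rule: a prior of exactly 1 (or 0) is a fixed point that no evidence can move. A minimal sketch (the likelihood values are made-up numbers for illustration):

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior P(H | E) from prior P(H) and likelihoods P(E | H), P(E | ~H)."""
    joint_true = prior * likelihood_if_true
    joint_false = (1 - prior) * likelihood_if_false
    return joint_true / (joint_true + joint_false)

# Evidence 1000x more likely if H is false drags a 0.999 prior down hard:
print(round(bayes_update(0.999, 0.001, 1.0), 3))  # about 0.5
# The same evidence leaves a prior of exactly 1 untouched, forever:
print(bayes_update(1.0, 0.001, 1.0))              # 1.0
```

With `prior = 1.0` the second term of the denominator vanishes, so the posterior is 1 no matter what the likelihoods say: the mind that assigns it can never change.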
1Eliezer Yudkowsky, “Cognitive Biases Potentially Affecting Judgment of Global Risks,” in Global Catastrophic Risks, ed. Nick Bostrom and Milan M. Ćirković (New York: Oxford University Press, 2008), 91–119.
2Peter de Blanc has an amusing anecdote on this point: http://www.spaceandgames.com/?p=27. (I told him not to do it again.)
Thanks, Eliezer. Helpful post.
I have personally witnessed a room of people nod their heads in agreement with a definition of a particular term in software testing. Then, when we discussed examples of that term in action, we discovered that many of us, having agreed with the words in the definition, had very different interpretations of those words. To my great discouragement, I learned that agreeing on a sign is not the same as agreeing on the interpretant or the object. (Sign, object, and interpretant are the three parts of Peirce’s semiotic triangle.)
In the case of 2+2=4, I think I know what that means, but when Euclid, Euler, or Laplace thought of 2+2=4, were they thinking the same thing I am? Maybe they were, but I’m not confident of that. And when someday an artificial intelligence ponders 2+2=4, will it be thinking what I’m thinking?
I feel 100% positive that 2+2=4 is true, and 100% positive that I don’t entirely know what I mean by “2+2=4”. I am also not entirely sure what other people mean by it. Maybe they mean “any two objects, combined with two objects, always results in four objects”, which is obviously not true.
In thinking about certainty, it helps me to consider the history of the number zero. That something so obvious could be unknown (or unrecognized as important) for so long is sobering. The Greeks would also have sworn that the square root of negative one has no meaning and certainly no use in mathematics. 100% certain! The Pythagoreans would have sworn it just before stoning you to death for math heresy.
We can go even stronger than mathematical truths. How about the following statement?
~(P &~P)
I think it’s safe to say that if anything is true, that statement (the flipping law of non-contradiction) is true. And it’s the precondition for any other knowledge (for no other reason than if you deny it, you can prove anything). I mean, there are logics that permit contradictions, but then you’re in a space that’s completely alien to normal reasoning.
So that’s lots stronger than 2+2=4. You can reason without 2+2=4. Maybe not very well, but you can do it.
So Eliezer, do you have a probability of 1 in the law of non-contradiction?
The truth of probability theory itself depends on non-contradiction, so I don’t really think that probability is a valid framework for reasoning about the truth of fundamental logic, because if logic is suspect, probability itself becomes suspect.
If you get past that one, I’ll offer you another.
“There is some entity [even if only a simulation] that is having this thought.” Surely you have a probability of 1 in that. Or you’re going to have to answer to Descartes’s upload, yo.
Well, maybe you fell asleep halfway through that thought, and thought the last half after you woke, without noticing you had slept.
That doesn’t answer it. You still had the thought, even with some time lapse. But even if you somehow say that doesn’t count, a trivial fix that supposition cannot answer would be “There is some entity [even if only a simulation] that is having at least a portion of this thought.”
If the goal here is to make a statement to which one can assign probability 1, how about this: something exists. That would be quite difficult to contradict (though it has been done by non-realists).
Is “exist” even a meaningful term? My probability on that is highish, but nowhere near unity.
My attempts to taboo “exist” led me to instrumentalism, so beware.
Is instrumentalism such a bad thing, though? It seems like instrumentalism is a better generalization of Bayesian reasoning than scientific realism, and it approaches scientific realism asymptotically as your prior for “something exists” approaches 1. (Then again, I may have been thoroughly corrupted in my youth by the works of Robert Wilson).
If you take instrumentalism seriously, then you remove external “reality” as meaningless, and only talk about inputs (and maybe outputs) and models. Basically, in the diagram from “Update then Forget,” you remove the top row of W’s, leaving dangling arrows where “objective reality” used to be. This is not very aesthetically satisfying, since the W’s link current actions to future observations, and without them the causality is not apparent or even necessary. This is not necessarily a bad thing, if you take care to avoid the known AIXI pitfalls of wireheading and anvil-dropping. But this is certainly not one of the more popular ontologies.
“Exist” is meaningful in the sense that “true” is meaningful, as described in EY’s The Simple Truth. I’m not really sure why anyone cares about saying something with probability 1 though; no matter how carefully you think about it, there’s always the chance that in a few seconds you’ll wake up and realize that even though it seems to make sense now, you were actually spouting gibberish. Your brain is capable of making mistakes while asserting that it cannot possibly be making a mistake, and there is no domain on which this does not hold.
I must raise an objection to that last point: there are one or more domains on which this does not hold. For instance, my belief that A→A is easily 100%, and there is no way for this to be a mistake. If you don’t believe me, substitute A = “2+2=4”. Similarly, I can never be mistaken in saying “something exists,” because for me to be mistaken about it, I’d have to exist.
You could be mistaken about logic, a demon might be playing tricks on you etc.
You can say “Sherlock Holmes was correct in his deduction.” That does not rely on Sherlock Holmes actually existing, it’s just noting a relation between one concept (Sherlock Holmes) and another (a correct deduction).
What would you say, if asked to defend this possibility?
This is true, but (at least if we’re channeling Descartes) the question is whether or not we can raise a doubt about the truth of the claim that something exists. Our ability to have this thought doesn’t prove that it’s true, but it may well close off any doubts.
The complexity-based prior for living in such a world is very low, but non-zero. Consequently, you can’t be straight 1.0 convinced it’s not the case.
A teapot could actually be an alien spaceship masquerading as a teapot-lookalike. That possibility is heavily, heavily discounted against using your favorite version of everyone’s favorite heuristic (Occam’s Razor). However, since it can be formulated (with a lot of extra bits), its probability is non-zero. Enough to reductio the “easily 100%”.
Well, this is a restatement of the claim that it’s possible to be deceived about tautologies, not a defense of that claim. But your post clarifies the situation quite a lot, so maybe I can rephrase my request: how would you defend the claim that it is possible (with any arbitrarily large number of bits) to formulate a world in which a contradiction is true?
I admit I for one don’t know how I would defend the contrary claim, that no such world could be formulated.
Probably heavily depends on the meaning of “formulate”, “contradiction” and “true”. For example, what’s the difference between “imagine” and “formulate”? In other words, with “any arbitrarily large number of bits” you can likely accurately “formulate” a model of the human brain/mind which imagines “a world in which a contradiction is true”.
I mean whatever Kawoomba meant, and so he’s free to tell me whether or not I’m asking for something impossible (though that would be a dangerous line for him to take).
Is your thought that unless we can (with certainty) rule out the possibility of such a model or rule out the possibility that this model represents a world in which a contradiction is true, then we can’t call ourselves certain about the law of non-contradiction? I grant that the falsity of that disjunct seems far from certain.
I am not a mathematician, but to me the law of non-contradiction is something like a theorem in propositional calculus, unrelated to a particular world. A propositional calculus may or may not be a useful model, depends on the application, of course. But I suppose this is straying dangerously close to the discussion of instrumentalism, which led us nowhere last time we had it.
It seems more like an axiom to me than a theorem: I know of no way to argue for it that doesn’t presuppose it. So I kind of read Aristotle for a living (don’t laugh), and he takes an interesting shot at arguing for the LNC: he seems to say it’s simply impossible to formulate a contradiction in thought, or even in speech. The sentence “this is a man and not a man” just isn’t a genuine proposition.
That doesn’t seem super plausible, however interesting a strategy it is, and I don’t know of anything better.
This seems like a version of “no true Scotsman.” Anyway, I don’t know much about Aristotle’s ideas, but what I do know, mostly physics-related, either is outright wrong or has been obsolete for the last 500 years. If this is any indication, his ideas on logic have probably long been superseded by first-order logic or something, and his ideas on language and meaning by something else reasonably modern. Maybe he is fun to read from the historical or literary perspective, I don’t know, but I doubt that it adds anything to one’s understanding of the world.
Well, his argument consists of more than the above assertion (he lays out a bunch of independent criteria for the expression of a thought, and argues that contradictions can never satisfy them). However I can’t disagree with you on this: no one reads Aristotle to learn about physics or logic or biology or what-have-you. To say that modern versions are more powerful, more accurate, and more useful is massive understatement. People still read Aristotle as a relevant ethical philosopher, though I have my doubts as to how useful he can be, given that he was an advocate for slavery, sexism, infanticide, etc. Not a good start for an ethicist.
On the other hand, almost no contemporary logicians think contradictions can be true, but no one I know of has an argument for this. It’s just a primitive.
This is true, but (at least if we’re channeling Descartes) the question is whether or not we can raise a doubt about the truth of the claim that something exists. Our ability to have this thought doesn’t prove that it’s true, but it may well close off any doubts.
Sure, it sounds pretty reasonable. I mean, it’s an elementary facet of logic, and there’s no way it’s wrong. But, are you really, 100% certain that there is no possible configuration of your brain which would result in you holding that A implies not A, while feeling the exact same subjective feeling of certainty (along with being able to offer logical proofs, such that you feel like it is a trivial truth of logic)? Remember that our brains are not perfect logical computers; they can make mistakes. Trivially, there is some probability of your brain entering into any given state for no good reason at all due to quantum effects. Ridiculously unlikely, but not literally 0. Unless you believe with absolute certainty that it is impossible to have the subjective experience of believing that A implies not A in the same way you currently believe that A implies A, then you can’t say that you are literally 100% certain. You will feel 100% certain, but this is a very different thing than actually literally possessing 100% certainty. Are you certain, 100%, that you’re not brain damaged and wildly misinterpreting the entire field of logic? When you posit certainty, there can be literally no way that you could ever be wrong. Literally none. That’s an insanely hard thing to prove, and subjective experience cannot possibly get you there. You can’t be certain about what experiences are possible, and that puts some amount of uncertainty into literally everything else.
So by that logic I should assign a nonzero probability to ¬(A→A). And if something has nonzero probability, you should bet on it if the payout is sufficiently high. Would you bet any amount of money or utilons at any odds on this proposition? If not, then I don’t believe you truly believe 100% certainty is impossible. Also, 100% certainty can’t be impossible, because impossibility implies that it is 0% likely, which would be a self-defeating argument. You may find it highly improbable that I can truly be 100% certain. What probability do you assign to me being able to assign 100% probability?
Yes, 0 is no more a probability than 1 is. You are correct that I do not assign 100% certainty to the idea that 100% certainty is impossible. The proposition is of precisely that form though, that it is impossible—I would expect to find that it was simply not true at all, rather than expect to see it almost always hold true but sometimes break down. In any case, yes, I would be willing to make many such bets. I would happily accept a bet of one penny, right now, against a source of effectively limitless resources, for one example.
As to what probability you assign; I do not find it in the slightest improbable that you claim 100% certainty in full honesty. I do question, though, whether you would make literally any bet offered to you. Would you take the other side of my bet; having limitless resources, or a FAI, or something, would you be willing to bet losing it in exchange for a value roughly equal to that of a penny right now? In fact, you ought to be willing to risk losing it for no gain—you’d be indifferent on the bet, and you get free signaling from it.
Indeed, I would bet the world (or many worlds) that (A→A) to win a penny, or even to win nothing but reinforced signaling. In fact, refusal to use 1 and 0 as probabilities can lead to being money-pumped (or at least exploited; I may be misusing the term “money-pump”). Let’s say you assign a 1/10^100 probability that your mind has a critical logic error of some sort, causing you to bound probabilities to the range [1/10^100, 1 − 1/10^100]. You can now be Pascal’s-mugged if the payoff offered is greater than the amount asked for by a factor of at least 10^100. If you claim the probability is less than 1/10^100 due to a leverage penalty or any other reason, you are admitting that your brain is capable of being more certain than the aforementioned bound (and such a scenario can be set up for any such number).
That’s not how decision theory works. The bounds on my probabilities don’t actually apply quite like that. When I’m making a decision, I can usefully talk about the expected utility of taking the bet, under the assumption that I have not made an error, and then multiply that by the odds of me not making an error, adding the final result to the expected utility of taking the bet given that I have made an error. This will give me the correct expected utility for taking the bet, and will not result in me taking stupid bets just because of the chance I’ve made a logic error; after all, given that my entire reasoning is wrong, I shouldn’t expect taking the bet to be any better or worse than not taking it. In shorter terms: EU(action) = EU(action & ¬error) + EU(action & error); also EU(action & error) = EU(anyOtherAction & error), meaning that when I compare any 2 actions I get EU(action) - EU(otherAction) = EU(action & ¬error) - EU(otherAction & ¬error). Even though my probability estimates are affected by the presence of an error factor, my decisions are not. On the surface this seems like an argument that the distinction is somehow trivial or pointless; however, the critical difference comes in the fact that while I cannot predict the nature of such an error ahead of time, I can potentially recover from it iff I assign >0 probability to it occurring. Otherwise I will never ever assign it anything other than 0, no matter how much evidence I see. In the incredibly improbable event that I am wrong, given extraordinary amounts of evidence I can be convinced of that fact. And that will cause all of my other probabilities to update, which will cause my decisions to change.
Your calculations aren’t quite right. You’re treating EU(action) as though it were a probability value (like P(action)). EU(action) would be more logically written E(utility | action), which itself is an integral over utility * P(utility | action) for utility ∈ (−∞, ∞), and which, due to linearity of multiplication and integrals, does have all the normal identities, like:

E(utility | action) = E(utility | action, e) * P(e | action) + E(utility | action, ¬e) * P(¬e | action).

In this case, if you do expand that out, using p << 1 for the probability of an error (which is independent of your action), and assuming E(utility | action1, error) = E(utility | action2, error), you get:

E(utility | action) = E(utility | error) * p + E(utility | action, ¬error) * (1 − p).

Or, for the difference between two actions, EU1 − EU2 = (EU1′ − EU2′) * (1 − p), where EU1′ and EU2′ are the expected utilities assuming no errors.

Anyway, this seems like a good model for “there’s a superintelligent demon messing with my head” kinds of error scenarios, but not so much for everyday math errors. For example, if I work out in my head that 51 is a prime number, I would accept an even-odds bet on “51 is prime.” But if I knew I had made an error in the proof somewhere, it would be a better idea not to take the bet, since less than half of the numbers near 50 are prime.
Right, I didn’t quite work all the math out precisely, but at least the conclusion was correct. This model is, as you say, exclusively for fatal logic errors; the sorts where the law of non-contradiction doesn’t hold, or something equally unthinkable, such that everything you thought you knew is invalidated. It does not apply in the case of normal math errors for less obvious conclusions (well, it does, but your expected utility given no errors of this class still has to account for errors of other classes, where you can still make other predictions).
The usage of “money-pump” is correct.
(Do note, however, that using 1 and 0 as probabilities when you in fact do not have that much certainty also implies the possibility for exploitation, and unlike the money pump scenario you do not even have the opportunity to learn from the first exploitation and self correct.)
A lot of this is a framing problem. Remember that anything we’re discussing here is in human terms, not (for example) raw Universal Turing Machine tape-streams with measurable Komolgorov complexities. So when you say “what probability do you assign to me being able to assign 100% probability”, you’re abstracting a LOT of little details that otherwise need to be accounted for.
I.e., if I’m computing probabilities as a set of propositions, each of which is a computable function that might predict the universe and a probability that I assign to whether it accurately does so, and in all of those computable functions my semantic representation of ‘probability’ is encoded as log odds with finite precision, then your question translates into a function which traverses all of my possible worlds, looks to see if one of those probabilities that refers to your self-assigned probability is encoded as the number ‘INFINITY’, multiplies that by the probability that I assigned that world being the correct one, and then tabulates.
Since “encoded as log odds with finite precision” and “encoded as the number ‘INFINITY’” are not simultaneously possible given certain encoding schemes, this really resolves itself to “do I encode floating-point numbers using a mantissa notation or other scheme that allows for values like +INF/-INF/+NaN/-NaN?”
Which sounds NOTHING like the question you asked, but the answers do happen to correlate perfectly (to within the precision allowed by the language we’re using to communicate right now).
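The encoding point can be made concrete in ordinary floating point. This is a sketch of the idea, not a claim about anyone’s actual cognitive architecture:

```python
import math

def to_log_odds(p):
    # Finite for every p strictly between 0 and 1; probability 1 would
    # need log odds of +infinity.
    return math.log(p / (1 - p))

def from_log_odds(L):
    return 1 / (1 + math.exp(-L))

assert to_log_odds(0.5) == 0.0

# Probability exactly 1 has no finite log-odds representation:
try:
    to_log_odds(1.0)
except ZeroDivisionError:
    pass  # p/(1-p) blows up; "INFINITY" is simply not in this encoding

# And finite precision saturates: in 64-bit floats, log odds of about 37
# already round back to a probability of exactly 1.0, misstating the credence.
assert from_log_odds(36.0) < 1.0
assert from_log_odds(37.0) == 1.0
```

So whether “probability 1” is expressible at all really does come down to the encoding scheme, exactly as described.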
Did that make sense?
When I say 100% certainty is impossible, I mean that there are no cases where assigning 100% to something is correct, but I have less than 100% confidence in this claim. It’s similar to the claim that it’s impossible to travel faster than the speed of light.
If any agent within a system were able to assign a 1 or 0 probability to any belief about that system being true, that would mean that the map-territory divide would have been broken.
However, since that agent can never rule out being mistaken about its own ontology or its reasoning mechanism (following an invisible, if vanishingly unlikely, internal failure), it can never gain final certainty about any feature of the territory, although it can get arbitrarily close.
What evidence convinces you now that something exists? What would the world look like if it were not the case that something existed?
Imagine yourself as a brain in a jar, without the brain and the jar. Would you remain convinced that something existed if confronted with a world that had evidence against that proposition?
Also (and sorry for the rapid-fire commenting), do you accept that we can have conditional probabilities of one? For example, P(A|A)=1? And, for that matter, P(B|(A-->B, A))=1? If so, I believe I can force you to accept at least probabilities of 1 in sound deductive arguments. And perhaps (I’ll have to think about it some more) in the logical laws that get you to the sound deductive arguments. I’m just trying to get the camel’s nose in the tent here...
The same holds for mathematical truths. It’s questionable whether the statement “2 + 2 = 4” or “In Peano arithmetic, SS0 + SS0 = SSSS0” can be said to be true in any purely abstract sense, apart from physical systems that seem to behave in ways similar to the Peano axioms.
Why is that important?
Let me ask you in reply, Paul, if you think you would refuse to change your mind about the “law of non-contradiction” no matter what any mathematician could conceivably say to you—if you would refuse to change your mind even if every mathematician on Earth first laughed scornfully at your statement, then offered to explain the truth to you over a couple of hours… Would you just reply calmly, “But I know I’m right,” and walk away? Or would you, on this evidence, update your “zero probability” to something somewhat higher?
Why can’t I repose a very tiny credence in the negation of the law of non-contradiction? Conditioning on this tiny credence would produce various null implications in my reasoning process, which end up being discarded as incoherent—I don’t see that as a killer objection.
In fact, the above just translates the intuitive reply, “What if a mathematician convinces me that ‘snow is white’ is both true and false? I don’t consider myself entitled to rule it out absolutely, but I can’t imagine what else would follow from that, so I’ll wait until it happens to worry about it.”
As for Descartes’s little chain of reasoning, it involves far too many deep, confusing, and ill-defined concepts to be assigned a probability anywhere near 1. I am not sure anything exists, let alone that I do; I am far more confident that angular momentum is conserved in this universe than I am that the statement “the universe exists” represents anything but confusion.
The one that I confess is giving me the most trouble is P(A|A). But I would prefer to call that a syntactic elimination rule for probabilistic reasoning, or perhaps a set equality between events, rather than claiming that there’s some specific proposition that has “Probability 1”.
I don’t know what the above sentence means. You must be using the word “exist” differently than I do.
This seems to me to be a very different question. “Do I doubt A?” and “Could any experience lead me to doubt A?” are different questions. They are equivalent for ideal reasoners, and we approximate ideal reasoners closely enough that treating the questions as interchangeable is typically a useful heuristic. Nonetheless, if absolute certainty is an intelligible concept at all, then I can imagine (1) being absolutely certain now that A is true, while (2) thinking it likely that some stream of words or experiences in the future could so confuse or corrupt me that I would doubt A.
But, if I allow that I could be corrupted into doubting what I am now certain is true, how can I be certain that my present certainty isn’t a result of such a corruption? At this point, my recursive justification would hit bottom: I am certain that my evaluation of P(A) as equal to 1 is not the result of a corruption because I am certain that A is true. Sure, the corrupted future version of myself would look back on my present certainty as mistaken. But that version of me is corrupted, so why would I listen to him?
ETA:
In your actual scenario, where all other mathematicians scorn my belief that ~(P&~P), I would probably conclude that everyone is doing something very different with logical symbols than what I thought that they were doing. If they persisted in not understanding why I thought that ~(P&~P) followed from the nature of conjunction, I would conclude that my brain works in such a different way that I cannot even map my concepts of basic logical operation into the concepts that other people use. I would start to doubt that my concept of conjunction is as useful as I thought (since everyone else apparently prefers some alternative), so I would spend a lot of effort trying to understand the concepts that they use in place of mine. I would consider it pretty likely that I would choose to use their concepts as soon as I understood them well enough to do so.
Huh, I must be slowed down because it’s late at night… P(A|A) is the simplest case of all. P(x|y) is defined as P(x,y)/P(y). P(A|A) is defined as P(A,A)/P(A) = P(A)/P(A) = 1. The ratio of these two probabilities may be 1, but I deny that there’s any actual probability that’s equal to 1. P(|) is a mere notational convenience, nothing more. Just because we conventionally write this ratio using a “P” symbol doesn’t make it a probability.
But it does obey the Kolmogorov axioms (it can’t be greater than 1 for instance); that seems important.
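On a finite sample space this is easy to verify directly. A toy check (two fair coin flips; the event names here are mine):

```python
from fractions import Fraction

# A toy finite sample space: two fair coin flips.
space = [("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]

def prob(event):
    # Probability of an event, represented as a list of outcomes.
    return Fraction(len([w for w in space if w in event]), len(space))

def cond(x, y):
    # P(x | y) = P(x and y) / P(y), defined whenever P(y) > 0.
    both = [w for w in y if w in x]
    return prob(both) / prob(y)

A = [w for w in space if w[0] == "H"]  # first flip is heads
B = [w for w in space if w[1] == "H"]  # second flip is heads

assert cond(A, A) == 1                 # the ratio P(A,A)/P(A) is exactly 1
assert 0 <= cond(A, B) <= 1            # and it obeys the Kolmogorov bounds
```

Whatever we call the ratio, it behaves like a probability measure on every event it is defined for.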
Hah, I’ll let Descartes go (or condition him on a workable concept of existence—but that’s more of a spitball than the hardball I was going for).
But in answer to your non-contradiction question… I think I’d be epistemically entitled to just sneer and walk away. For one reason, again, if we’re in any conventional (i.e. not paraconsistent) logic, admitting any contradiction entails that I can prove any proposition to be true. And, giggle giggle, that includes the proposition “the law of non-contradiction is true.” (Isn’t logic a beautiful thing?) So if this mathematician thinks s/he can argue me into accepting the negation of the law of non-contradiction, and takes the further step of asserting any statement whatsoever to which it purportedly applies (i.e. some P, for which P&~P, such as the whiteness of snow), then lo and behold, I get the law of non-contradiction right back.
I suppose if we wanted to split hairs, we could say that one can deny the law of non-contradiction without further asserting an actual statement to which that denial applies—i.e. ~(~(P&~P)) doesn’t have to entail the existence of a statement P which is both true and false ((∃p)Np, where N stands for “is true and not true?” Abusing notation? Never!) But then what would be the point of denying the law?
(That being said, what I’d actually do is stop long enough to listen to the argument—but I don’t think that commits me to changing my zero probability. I’d listen to the argument solely in order to refute it.)
As for the very tiny credence in the negation of the law of non-contradiction (let’s just call it NNC), I wonder what the point would be, if it wouldn’t have any effect on any reasoning process EXCEPT that it would create weird glitches that you’d have to discard? It’s as if you deliberately loosened one of the spark plugs in your engine.
There are, apparently, certain Eastern philosophies that permit and even celebrate logical contradiction. To what extent this is metaphorical I couldn’t say, but I recently spoke to an adherent who quite firmly believed that a given statement could be both true and false. After some initial bewilderment, I verified that she wasn’t talking about statements that contained both true and false claims, or were informal and thus true or false under different interpretations, but actually meant what she’d originally seemed to mean.
I didn’t at first know how to argue such a basic axiom—it seemed like trying to talk a rock into consciousness—but on reflection, I became increasingly uncertain what her assertion would even mean. Does she, when she thinks “Hmm, this is both true and false” actually take any action different than I would? Does belief in NNC wrongly constrain some sensory anticipation? As Paul notes, need the law of non-contradiction hold when not making any actual assertions?
All this is to say that the matter which at first seemed very simple became confusing along a number of axes, and though Paul might call any one of these complaints “splitting hairs” (as would I), he would probably claim this with far less certainty than his original 100% confidence in NNC’s falsehood: That is, he might be more open-minded about a community of mathematicians explaining why actually some particular complaint isn’t splitting hairs at all and is highly important for some non-obvious reasons and due to some fundamental assumptions being confused it would be misleading to call NNC ‘false’.
But more simply, I think Paul may have failed to imagine how he would actually feel in the actual situation of a community of mathematicians telling him that he was wrong. Even more simply, I think we can extrapolate a broader mistake of people who are presented with the argument against infinite certainty replying with a particular thing they’re certain about, and claiming that they’re even more certain about their thing than the last person to try a similar argument. Maybe the correct general response to this is to just restate Eliezer’s reasoning about any 100% probability simply being in the reference class of other 100% probabilities, less than 100% of which are correct.
That would be Jain logic.
(Note: This comment is not really directed at Paul himself, seeing as he’s long gone, but at anyone who shares the sentiments he expresses in the above comment)
Note that there is almost certainly at least one person out there who is insane, drugged up, or otherwise cognitively impaired, who believes that the Law of Non-Contradiction is in fact false, is completely and intuitively convinced of this “fact”, and who would sneer at any mathematician who tried to convince him/her otherwise, before walking away. Do you in fact assign 100% probability to the hypothesis that you are not that drugged-up person?
Wait a second, conditional probabilities aren’t probabilities? Huhhh? Isn’t Bayesianism all conditional probabilities?
P(P is never equal to 1) = ?
I know, I know, ‘this statement is not true’. But we’ve long since left the real world anyway. However, if you tell me the above is less than one, that means that in some cases, infinite certainty can exist, right?
Get some sleep first though Eliezer and Paul. It’s 9.46am here.
He answered that.
It means that there might be cases where infinite certainty can exist. There also might be cases where the speed of light can be exceeded, conservation of energy can be violated, etc. There probably aren’t cases of any of these.
For one reason, again, if we’re in any conventional (i.e. not paraconsistent) logic, admitting any contradiction entails that I can prove any proposition to be true.
Yes, but conditioned on the truth of some statement P&~P, my probability that logic is paraconsistent is very high.
Bayesianism is all about ratios of probabilities, yes, but we can write these ratios without ever using the P(|) notation if we please.
“I’d listen to the argument solely in order to refute it.”
Paul refutes the data! Eliezer, an idiot disagreeing with you shouldn’t necessarily shift your beliefs at all. By that token, there’s no reason to shift your beliefs if the whole world told you 2 + 2 were 3, unless they showed some evidence. I would think it vastly more likely that the whole world was pulling my leg.
Sometimes I feel like religion is the whole world pulling my leg.
Assert a confidence of (1 − 1/googolplex) and your ego far exceeds that of mental patients who think they’re God.
So we are considering the possibility of brain malfunctions, and deities changing reality. Fine. But what is the use of having a strictly accurate Bayesian reasoning process when your brain is malfunctioning and/or deities are changing the parameters of reality?
Eliezer, I want to complement you on this post. But I would suggest that you apply it more generally, not only to mathematics. For example, it seems to me that any of us should be (or rather, could be after thinking about it for a while) more sure that 53 is a prime number than that a creationist with whom we disagree is wrong. This seems to imply that our certainty of the theory of evolution shouldn’t be more than 99.99%, according to your figure, definitely less than a string of nines as long as the Bible (as you have rhetorically suggested in the past.)
A string of nines as long as the Bible is really, really long.
But if we aren’t willing to assign probabilities over some arbitrary limit (other than 1 itself), we’ve got some very serious problems in our epistemology. I would assign a probability to the Modern Synthesis somewhere around 0.99999999999999 myself.
If proposition An is the proposition “the nth person gets struck by lightning tomorrow”, then consider the following conjunction, n going of course from 1 to 7 billion: P(A1 & A2 & … & A7e9). Now consider the negation of this conjunction: P(~(A1 & A2 & … & A7e9)).
I had damn well better be able to assign a probability greater than 0.9999 to the negation, or else I couldn’t assign a probability lower than 0.0001 to the original conjunction. And then I’m estimating a 1/10000 chance of everyone on Earth getting struck by lightning on any given day, which means it should have happened several times in the last century. Also, I can’t assign a probability of any one person being struck as less than 1/10000, because obviously that person must get struck if everyone is to be struck.
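The back-of-the-envelope arithmetic there checks out:

```python
# If "everyone on Earth is struck by lightning tomorrow" were assigned a
# probability of 1/10000 per day, how often should it have happened in a century?
p_conjunction = 1e-4
days_per_century = 365.25 * 100

expected_occurrences = p_conjunction * days_per_century
assert 3 < expected_occurrences < 4   # "several times in the last century"
```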
Paul Gowder said:
“We can go even stronger than mathematical truths. How about the following statement?
~(P &~P)
I think it’s safe to say that if anything is true, that statement (the flipping law of non-contradiction) is true.”
Amusingly, this is one of the more controversial tautologies to bring up. This is because constructivist mathematicians reject this statement.
No, they reject P V ~P.
They do not reject ~(P&~P). Only paraconsistent logicians do that.
And paraconsistent logicians are silly.
Gray Area said: “Amusingly, this is one of the more controversial tautologies to bring up. This is because constructivist mathematicians reject this statement.”
Actually constructivist mathematicians reject the law of the excluded middle, (P v ~P), not the law of non-contradiction (they are not equivalent in intuitionistic logic, the law of non-contradiction is actually equivalent to the double negation of the excluded middle).
The ratio of these two probabilities may be 1, but I deny that there’s any actual probability that’s equal to 1. P(|) is a mere notational convenience
I’d have to disagree with that. The axioms I’ve seen of probability/measure theory do not make the case that P() is a probability while P(|) is not—they are both, ultimately, the same type of object (just taken from different measurable sets).
However, you don’t need to appeal to this type of reasoning to get rid of P(A|A) = 1. Your probability of correctly remembering the beginning of the statement when reaching the end is not 1 - hence there is room for doubt. Even your probability of correctly understanding the statement is not 1.
P(P is never equal to 1) = ?
I know, I know, ‘this statement is not true’.
Would this be an argument for allowing “probabilities of probabilities”? So that you can assign 99.9999% (that’s enough 9s, I feel) to the statement “P(P is never equal to 1)”.
If you say 99.9999% confidence, you’re implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once.
Excellent post overall, but that part seems weakest—we suffer from an unavailability problem, in that we can’t just think up random statements with those properties. When I said I agreed 99.9999% with “P(P is never equal to 1)” it doesn’t mean that I feel I could produce such a list—just that I have a very high belief that such a list could exist.
An intermediate position would be to come up with a hundred equally fraught statements in a randomly chosen narrow area, and extrapolate from that result.
Stuart: When I said I agreed 99.9999% with “P(P is never equal to 1)” it doesn’t mean that I feel I could produce such a list—just that I have a very high belief that such a list could exist.
So, using Eliezer’s logic, would you expect that one time in a million, you’d get this wrong, and P = 1? I don’t need to you to produce a list. This is a case where no number of 9s will sort you out—if you assign a probability less than 1, you expect to be in error at some point, which leaves you up the creek. If I’m making a big fat error (and I fear I may be), someone please set me straight.
Mr. Bach,
I think you’re right to point out that “number” meant a different thing to the Greeks; but I think that should make us more, not less, confident that “2+2=4.” If the Greeks had meant the same thing by number as modern mathematicians do, then they were wrong to be very confident that the square root of negative one was not a number. However, the square root of negative one does in fact fall short of being a simple, definite multitude—what Euclid, at least, meant by number. So if they were in error, it was the practical error of drawing an unnecessary distinction, not a contradictory one.
Perhaps “100% certain” or “P=1” could mean that I believe something to be true with the same level of certainty as that by which I believe certainty and probability to be coherent terms. We can only evaluate judgments if we accept “judgment” as a valid kind of thought anyway.
Yeah, imagine what a mess it would be to try to rewrite the axioms of probability as themselves probabilistic?
Ben, you’re making an obvious error: you are taking the statement that “P never equals 1” has a probability of less than 1 to mean that in some proportion of cases, we expect the probability to equal 1. This would be the same as supposing that assigning the light-speed limit a probability of less than 1 implies that we think that the speed of light is sometimes exceeded.
But it doesn’t mean this, it means that if we were to enunciate enough supposed physical laws, we would sometimes be mistaken. In the same way, a probability of less than 1 for the proposition that we should never assign a probability of 1 simply means that if we take enough supposed claims regarding mathematics, logic, and probability theory, each of which we take to be as certain as the claim rejecting a probability of unity, we would sometimes be mistaken. This doesn’t mean that any proposition has a probability of unity.
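One way to picture the reference-class reading: simulate a forecaster who makes a million claims, each with a hypothetical one-in-a-million chance of being mistaken. The forecaster expects roughly one error somewhere in the batch, which says nothing about any individual claim “sometimes” being false:

```python
import random

random.seed(0)  # make the sketch reproducible

# A forecaster makes a million claims, each with a hypothetical
# one-in-a-million chance of being mistaken.
n_claims, p_error = 10**6, 1e-6
mistakes = sum(random.random() < p_error for _ in range(n_claims))

# Roughly one error is expected across the whole batch -- a fact about
# the reference class, not about any single claim.
assert 0 <= mistakes <= 10
```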
There are uncountably many possible worlds. Using standard real-number-valued probabilities, we have to assign probability zero to (I think) almost all of them. In other words, for almost all of the possible worlds, the probability of the complement of that possible world is 1.
(Are there ways around this, perhaps using non-real-valued probabilities?)
We could use hyperfinites maybe?
(Waking up.) Sure, if I thought I had evidence (how) of P&~P, that would be pretty good reason to believe a paraconsistent logic was true (except what does true mean in this context? not just about logics, but about paraconsistent ones!!)
But if that ever happened, if we went there, the rules for being rational would be so radically changed that there wouldn’t necessarily be good reason to believe that one has to update one’s probabilities in that way. (Perhaps one could say the probability of the law of non-contradiction being true is both 1 and 0? Who knows?)
I think the problem with taking a high probability that logic is paraconsistent is that all other beliefs stop working. I don’t know how to think in a paraconsistent logic. And I doubt anyone else does either. (Can you get Bayes Rule out of a paraconsistent logic? I doubt it. I mean, maybe… who knows?)
The proposition in which I repose my confidence is the proposition that “2 + 2 = 4 is always and exactly true”, not the proposition “2 + 2 = 4 is mostly and usually true”.
I have confused the map with the territory. Apologies. Revised claim: I believe, with 99.973% probability, that P cannot equal 1, 100% of the time! I believe very strongly that I am correct, and if I am correct, I am completely correct. But I’m not sure. Much better.
I suppose we should be asking ourselves why we tend to try hard to retain the ability to be 100% sure. A long long list of reasons spring to mind....
Well, the real reason why it is useful in arithmetic to accept that 2+2=4 is that this is part of a deeper relation in the arithmetic field regarding relations between the three basic arithmetic operations: addition, multiplication, and exponentiation. Thus, 2 is the solution to the following question: what is x such that x plus x equals x times x equals x to the x power? And, of course, all of these operations equal 4.
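That chain of equalities is quickly verified:

```python
# 2 is the x for which x + x, x * x, and x ** x all coincide (at 4):
x = 2
assert x + x == x * x == x ** x == 4

# No other positive integer satisfies the whole chain:
assert all(not (n + n == n * n == n ** n) for n in range(1, 100) if n != 2)
```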
Can someone write/has someone written a program that simulates existence in a world in which 2+2=4 (and the rest of Peano arithmetic) is useless i.e. it corresponds to no observable phenomenon in that world?
What would such a simulation look like?
Oh, on the ratios of probabilities thing, whether we call them probabilities or schmobabilities, it still seems like they can equal 1. But if we accept that there are schmobabilities that equal 1, and that we are warranted in giving them the same level of confidence that we’d give probabilities of 1, isn’t that good enough?
Put a different way, P(A|A)=1 (or perhaps I should call it S(A|A)=1) is just equivalent to yet another one of those logical tautologies, A-->A. Which again seems pretty hard to live without. (I’d like to see someone prove NCC to me without binding me to accept NCC!)
Well, the deeper issue is “Must we rely on the Peano axioms?” I shall not get into all the Godelian issues that can arise, but I will note that by suitable reinterpretations, one can indeed pose real world cases where an “apparent two plus another apparent two” do not equal “apparent four,” without being utterly ridiculous. The problem is that such cases are not readily amenable to being easily put together into useful axiomatic systems. There may be something better out there than Peano, but Peano seems to work pretty well an awful lot.
As for “what is really true?” Well…
Silas, does the “null world” count?
Z._M._Davis: No. Why? Because I said so ;-)
Point taken, I need to better constrain the problem. So, how about, “It must be able to sustain transfer of information between two autonomous agents.” But then I’ve used the concept of “two” autonomous agents. Eek!
So a better specification would be, “The world must contain information.” Or, more rigorously, “The world must have observable phenomena that aid in predicting future phenomena.”
Now, can such a simulated world exist? And is there a whole branch of philosophy addressing this problem that I need to brush up on?
It’s nice that you’re honest and open about the fact that your position presupposes an exceptionally weird sort of skepticism (hence the need to fall back on the possibility of being in The Matrix). Since humans are finite, there’s no reason to think absolute confidence in everything isn’t attainable; just enumerate the biases. Only by positing some weird sort of subjectivism can you get the sort of infinite regress needed to discount the possibility; I can never really know because I’m trapped inside my head. Why is the uncertainty fetish so appealing that people will entertain such weird ideas to retain it?
Why is the certainty fetish so appealing that people will ignore the obvious fact that all conclusions are contingent?
Poke, consideration of the possibility of being in the matrix doesn’t necessarily require “an exceptionally weird sort of skepticism.” It might only require an “exceptionally weird” form of futurism.
If I correctly remember my Jesuit teachers’ explanation from 40 years ago, the epistemological branch of classical philosophy deals thusly with this situation: an “a priori” assertion is one which exhibits the twin characteristics of universality and necessity. 2+2=4 would be such an assertion. Should there ever be an example which violates this a priori assertion, it is simply held to be unreal, because reality is a construct of consensus. Consensus dictates to reality but not to experience. So if, for example, you see a ghost or are abducted by a UFO, you’re simply out of contact with reality, and, as a crazy person, you can’t legitimately challenge what the rest of us hold to be indisputably true.
I hope the gentleman got better.
Eli said:
Peter de Blanc has an amusing anecdote on this point, which he is welcome to retell in the comments.
Here’s the anecdote.
I’m sorry. Eliezer, can you please explain to me what you mean when you say the how certain you are (probability %) that something is true? I’ve studied a lot of statistics, but I really have no idea what you mean.
If I say that this fair coin in my hand has a 50% chance of coming up heads, then that means that if I flip it a lot of times, then it’ll be heads 50% of the times. I can do that with a lot of real, measurable things.
So, what do you mean by, you are 99% certain of something?
It means that, given Eliezer’s knowledge, the probabilities of the necessary preconditions for the state in question multiplied together yield 0.99.
If you have a coin that you believe to be fair, and you flip it, how likely do you think it is that it will land on edge?
Q, Eliezer’s probabilities are Bayesian probabilities. (Note the “Bayesian” tag on the post.)
Q: let’s say I offer you a choice between (a) and (b).
a. Tomorrow morning you can flip that coin in your hand, and if it comes up heads, then I’ll give you a dollar.
b. Tomorrow morning, if it is raining, then I will give you a dollar.
If you choose (b) then your probability for rain tomorrow morning must be higher than 1⁄2.
Well… kinda. It could just be that if it rains, you will need to buy a $1 umbrella, but if it doesn’t rain then you don’t need money at all. It would be nice if we had some sort of measurement of reward that didn’t depend on the situation you find yourself in. Decision theorists like to call this “utility.”
I’m not sure if it’s silly to try to define probabilities in terms of decision theory rather than vice versa. ET Jaynes defines probabilities as real numbers that we assign to propositions representing a “degree of plausibility,” and satisfying some desiderata. Eli has lately been talking about probabilities in terms of the fraction of statements assigned that probability which are true, but I don’t think he considers this a definition of probability (I hope not; it would be a bad definition).
Anyway, I’ll say that what makes something a probability is not any property of the thing it references; it’s what you do with it. If you use it to weight hypotheses in expected utility calculations which determine your actions, then it’s a probability.
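The coin-versus-rain choice can be written out in those expected-utility terms. This sketch assumes utility is linear in dollars (exactly the complication flagged above) and uses a made-up credence in rain:

```python
# Choosing between the two offers, assuming utility is linear in dollars.
def expected_payoff(p_win, payoff=1.0):
    return p_win * payoff

p_heads = 0.5   # fair coin
p_rain = 0.7    # a made-up credence in rain tomorrow

choice = "rain bet" if expected_payoff(p_rain) > expected_payoff(p_heads) else "coin flip"
assert choice == "rain bet"   # preferring (b) reveals p_rain > 1/2
```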
No, no, no. Three problems, one in the analogy and two in the probabilities.
First, an individual particle can briefly exceed the speed of light; the group velocity cannot. Go read up on Cerenkov radiation: It’s the blue glow created by (IIRC) neutrons briefly breaking through c, then slowing down. The decrease in energy registers as emitted blue light.
Second: conditional probabilities are not necessarily given by a ratio of densities. You’re conditioning on (or working with) events of measure-zero. These puzzlers are why measure theory exists—to step around the seeming ‘inconsistencies’.
Third: The probability of a probability is superfluous. Probabilities are (thanks to Kolmogorov) just the expectation of indicator variables. Thus P(P()=1) = E(I(E(I())=1)) = 0 or 1; the randomness is all eliminated by the inside expectation.
Leave the musings on probabilities to the statisticians; they’ve already thought about these supposed paradoxes.
I thought it was due to charged particles exceeding the phase velocity of light in a medium, which is invariably slower than c. The particle is still going slower than c:
Wikipedia
First, an individual particle can briefly exceed the speed of light; the group velocity cannot. Go read up on Cerenkov radiation: It’s the blue glow created by (IIRC) neutrons briefly breaking through c, then slowing down. The decrease in energy registers as emitted blue light.
Breaking through the speed of light in a medium, but remaining under c (the speed of light in a vacuum).
Thank you.
I’ve actually used Bayesian perspectives (maximum entropy, etc) but I’ve never looked at it as a subjective degree of plausibility. Based on the Wikipedia article, I guess I haven’t been looking at it the way others have. I understand where Eli is coming from in applying Information theory. He doesn’t have complete information, so he won’t say that he has probability 1. He could get another bit of information which changes his belief, but he thinks (based on prior observation) that is very low.
I guess, I have problem with him maybe overreaching. It doesn’t make sense to say that this subjective personal probability (which, by the way, he chose to calculate based on a tiny subset of the vast amounts of information he has in his mind) based on his observed evidence is somehow the absolute probability that, say, evolution is “true”.
It doesn’t make sense to say that this subjective personal probability (which, by the way, he chose to calculate based on a tiny subset of the vast amounts of information he has in his mind) based on his observed evidence is somehow the absolute probability that, say, evolution is “true”.
Where does he? I assume as a Bayesian he would deny the reality of any such “absolute probability”.
There are such things as objective Bayesians, though I’m pretty sure Eliezer is a subjective Bayesian.
Subjectively objective, by his words.
Cumulant-nimbus,
There’s no shortage of statisticians who would disagree with your assertion that the probability of a probability is superfluous. A good place to start is with de Finetti’s theorem.
de Finetti assumes conditioning. If I am taking conditional expectations, then iterated expectations (with different conditionings) is very useful.
But iterated expectations, all with the same conditioning, is superfluous. That’s why I took care not to put any conditioning into my expectations.
Or we can criticize the probability-of-a-probability musings another way as having undefined filtrations for each of the stated probabilities.
“But iterated expectations, all with the same conditioning, is superfluous. That’s why I took care not to put any conditioning into my expectations.”
Fair enough. My point is that the de Finetti theorem provides a way to think sensibly about having a probability of a probability, particularly in a Bayesian framework.
Let me give a toy example to demonstrate why the concept is not superfluous, as you assert. Compare two situations:
(a) I toss a coin that I know to be as symmetrical in construction as possible.
(b) A magician friend of mine, who I know has access to double-headed and double-tailed coins, tosses a coin. I have no idea about the provenance of the coin she is using.
My epistemic probability for the outcome of the toss, in both cases, is 0.5, from symmetry arguments. (Not physical symmetry, epistemic symmetry—that is, symmetry of the available pre-toss information to an interchange of heads and tails.) My epistemic “probability of the probability” of the toss is different in the two cases. In case (a) it is nearly a delta function at 0.5, the sharpness of distribution being a function of my knowledge of the state of the art in symmetrical coin minting. In case (b), it is a mixture of distributions encoding the possible types of coins my friend might have chosen.
And this could make a real difference, if you are shown the results of 5 tosses (they were all heads) and then asked to bet on the next one.
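The difference is easy to make concrete. Here is a toy Python sketch of case (b), assuming, purely for illustration, that the magician is equally likely to use a double-headed, fair, or double-tailed coin. After five observed heads the predictive probability for the next toss moves far from 0.5, whereas for the symmetric coin of case (a) it stays at 0.5 exactly:

```python
from fractions import Fraction

# Hypothetical mixture for case (b): the magician is equally likely to use a
# double-headed, fair, or double-tailed coin. The weights are made up; only
# the qualitative behaviour matters.
coins = {  # coin type -> (prior weight, per-toss P(heads))
    "double-head": (Fraction(1, 3), Fraction(1)),
    "fair":        (Fraction(1, 3), Fraction(1, 2)),
    "double-tail": (Fraction(1, 3), Fraction(0)),
}

def p_next_heads(n_heads_seen):
    """P(next toss is heads | n consecutive heads so far), by Bayes."""
    joint = {c: w * p**n_heads_seen for c, (w, p) in coins.items()}
    total = sum(joint.values())
    return sum(joint[c] / total * coins[c][1] for c in coins)

print(p_next_heads(0))  # 1/2: same as the symmetric coin, before any evidence
print(p_next_heads(5))  # 65/66: five heads pushes the prediction far from 1/2
```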
I’m totally missing the “N independent statements” part of the discussion; that seems like a total non-sequitur to me. Can someone point me at some kind of explanation?
-Robin
It’s an oddly frequentist approach to Bayesianism.
Good point about infinite certainty, poor example.
Leaky induction. Didn’t that feel a little forced?
“(the sum of) 2 + 2” means “4”; or to make it more obvious, “1 + 1” means “2”. These aren’t statements about the real world*, hence they’re not subject to falsification, they contain no component of ignorance, and they don’t fall under the purview of probability theory.
*Here your counter has been that meaning is in the brain and the brain is part of the real world. Yet such a line of reasoning, even if it weren’t based on a category error, proves too much: it cuts the ground from under your absolute certainty in the Bayesian approach—the same certainty you needed in order to make accurate statements about 99.99---% probabilities in the first place.
The laws of probability are only useful for rationality if you know when they do and don’t apply.
We can be wrong about what the words we use mean.
What category error would that be?
We don’t have absolute certainty in ‘the Bayesian approach’. It would be counter-productive at best if we did, since then our certainty would be too great for evidence from the world to change our mind, hence we’d have no reason to think that if the evidence did contradict ‘the Bayesian approach’, we’d believe differently. In other words, we’d have no reason as Bayesians to believe our belief, though we’d remain irrationally caught in the grips of that delusion.
Even assuming that it’s a matter of word meanings that the four millionth digit of pi is 0, you can still be uncertain about that fact, and Bayesian reasoning applies to such uncertainty in precisely the same way that it applies to anything else. You can acquire new evidence that makes you revise your beliefs about mathematical theorems, etc.
For the record, I assign a probability larger than 1/googolplex to the possibility that one of the mental patients actually is God.
If you forced me to come up with 10,000 statements I knew to >=99.99% confidence, I would find it easy, given sufficient time. Most of them would have probability much, much higher than 99.99%, however.
Here is a sample of the list: I am not the Duke of Edinburgh. Ronald McDonald is not on my roof. I am not currently in a bath. I am currently making a list of things I believe are highly likely. Eliezer Yudkowsky is not a paperclip-maximising AI. I am not the 10,000th sentient being ever to have existed. The Queen is not a cocker spaniel in disguise. I am not a p-zombie.
53 has no prime factors other than itself. (This is much greater certainty, as I can hold in my mind the following facts simultaneously: “the root of 53 is less than 8; 53 is not in the 7 times table; 53 is not in the 5 times table; 53 is not in the 3 times table; and 53 is odd.” For 53 not to be prime would require, as for 2+2 not to equal 4, that I be very insane. My probability of being that insane is less than 1 in 10,000, and of having that specific insanity is lower still.)
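That mental checklist is exactly trial division up to the square root. A minimal sketch:

```python
import math

def is_prime(n):
    """Trial division: n > 1 is prime iff no d in [2, sqrt(n)] divides it;
    the same check as above (root of 53 is under 8, so 2, 3, 5, 7 suffice)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

assert is_prime(53)
assert not is_prime(51)  # 51 = 3 * 17: an easy misreading of 53
```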
The difficult part is in finding 10,000 statements with precisely 1 in 10,000 odds; not finding 10,000 statements with less than 1 in 10,000 odds.
If you can make a statement every two seconds, you could actually stand up and do this. If I could get sponsorship to offset existential risk, I’d take this challenge on to actually stand up for the best part of a day and make 10,000 true statements with nary a false one.
I would however go for less variety than you if I wanted to be confident of winning this challenge. “My teeth are smaller than Jupiter. The Queen is smaller than Jupiter. A Ford Mondeo is smaller than Jupiter...”
Those statements aren’t even approximately independent, though: if Jupiter turns out to be really small, they’re all false together. That’s why mine were so weird, the independence clause.*
*(they still aren’t actually independent, but I’m >99.99% sure you couldn’t make a set of statements that were)
However, it’s possible to make a set of statements that are mutually exclusive, which might actually be a superior task: “I am not the 11,043rd sentient entity ever to exist. I am not the 21,043rd sentient entity ever to exist,” etc.
I perceive the intention of the original assertion is that even in this case you would still fail in making 10,000 independent statements of this sort; i.e., in trying to do it, you are quite likely to somehow make a mistake at least once, say by a typo, a slip of the tongue, accidentally omitting a ‘not’, or whatever. All it takes to fail on a statement like “53 is prime” is for you to not notice that it actually says “51 is prime”, or to make some mistake when dividing.
Any random statement of yours has a ‘ceiling’ of x-nines accuracy.
Even any random statement of yours where it is known that you aren’t rushed, tired, on medication, drunk, or sleepy, and that you had a chance and intent to review it several times, still has some accuracy ceiling: a couple of orders of magnitude higher, but still definitely not 1.
I’m really not sure what exactly you mean by “independent statements” in this post.
If you put a chair next to another chair, and you found that there were three chairs where before there was one, would it be more likely that 1 + 1 = 3 or that arithmetic is not the correct model to describe these chairs? A true mathematical proposition is a pure conduit between its premises and axioms and its conclusions.
But note that you can never be quite completely certain that you haven’t made any mistakes. It is uncertain whether “S0 + S0 = SS0” is a true proposition of Peano arithmetic, because we may all coincidentally have gotten something hilariously wrong.
This is why, when an experiment does not go as predicted, the first recourse is to check that your math has been done correctly.
Eliezer, what could convince you that Bayes’ Theorem itself was wrong? Can you properly adjust your beliefs to account for evidence if that adjustment is systematically wrong?
First we’d have to attach a meaning to the claim, yes? I’ve seen evidence for various claims about Bayes’ Theorem, including but probably not limited to ‘Any workable extension of logic to deal with uncertainty will approximate Bayes,’ and ‘Bayes works better in practice than frequentist methods’. Decide which claim you want to talk about and you’ll know what evidence against it would look like.
(Halpern more or less argues against the first one, but I’m looking at his article and so far he just seems to be pointing out Jaynes’ most commonsensical requirements.)
I intended the claim posed here about tests and priors. It is posed as
p(A|X) = p(X|A)p(A) / [p(X|A)p(A) + p(X|~A)p(~A)]
But does it make sense for that to be wrong? It is a theorem, unlike the statement 2+2=4. Maybe some sort of way to show that the axioms and definitions used to prove Bayes’ Theorem are inconsistent, which is a pretty clear kind of proof. I’m not sure anymore that what I said has meaning. Well, thanks for the help.
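For what it’s worth, the formula as posed is easy to sanity-check numerically. A small sketch with made-up numbers (a 1% prior, a 90% hit rate, a 5% false-positive rate):

```python
def posterior(p_a, p_x_given_a, p_x_given_not_a):
    """Bayes' theorem exactly as posed:
    p(A|X) = p(X|A)p(A) / [p(X|A)p(A) + p(X|~A)p(~A)]"""
    numerator = p_x_given_a * p_a
    return numerator / (numerator + p_x_given_not_a * (1 - p_a))

# Made-up numbers: 1% prior, 90% hit rate, 5% false-positive rate.
print(posterior(0.01, 0.90, 0.05))  # ~0.154: a positive result, far from certainty
```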
Uh, 2+2=4 is most definitely a theorem. A very simple and obvious theorem, yes. But a theorem.
For Gödel-Bayes issues, you can start with the responses to my post on the subject. (I’ve since learned and remembered more about Gödel.)
We should have the ability to talk about subjective uncertainty in, at the very least, particular proofs and probabilities. I don’t know that we can. But I like the following argument, which I recall seeing here somewhere:
If there exists a perfect probability calculation based on a set of background information, it must take this uncertainty into account. Therefore, applying this uncertainty again to the answer would mean double-counting the evidence, which is strictly verboten. We therefore cannot use this line of reasoning to produce a contradiction. Barring other arguments, we can assume the uncertainty equals a really small fraction.
E.g., suppose a guy comes out tomorrow with a proof of the Riemann Hypothesis. What are the chances he is wrong? Surely not zero.
But the chance that the Riemann Hypothesis itself is wrong, if it has a proof? Well, that kinda seems like zero. (But then, how would we know that? It does seem like we have to filter through our unreliable senses.)
Hrmm… I’m still taking high school geometry, so “infinite set of axioms” doesn’t really make sense yet. I’ll try to re-read that thread once I’ve started college-level math.
“But once I assign a probability of 1 to a proposition, I can never undo it. No matter what I see or learn, I have to reject everything that disagrees with the axiom. ”
I think this is what causes the religious argument paradox. On a deep down level, most of us realize this is true.
Why would a rational human agent even WANT infinite certainty? It’s inherently pathological.
OCD checkers feel a general and relatively strong need to be certain about the veracity of recollections and that they have high standards for memory performance. This may explain earlier findings that OCD checkers have a general tendency to distrust their episodic memory. A need for certainty and a critical attitude towards memory performance may not be problematic or abnormal. It is suggested that clinical problems arise when the patient tries to fight memory distrust by repeated checking. The latter does not reduce distrust but rather increases distrust and the patient may get trapped in a spiral of mutually reinforcing checking behavior and memory distrust.
It’s not at all hard for a mathematician to come up with arbitrarily large numbers of statements that have about the same confidence as 2+2=4. There are lots of ways. Perhaps the most obvious is “n+2 = (n+1)+1” for arbitrarily large whole numbers n. It’s rather silly to talk about how many lifetimes it would take to say these statements, because there they are, in two seconds.
I suppose the anticipated response would be to question whether these are independent statements. Why would they not be? If we are anticipating that 2+2 may not be 4, I don’t see how we can say with certainty that any similar statement in arithmetic would imply any other. But perhaps it would be clearer if I changed the formula for the statements to this: “2+2 is not equal to n”, for arbitrarily large whole numbers n greater than 4. Of course this is no real difference, except it now looks a lot like an argument for saying that a 1-in-a-million probability is sensible in cases where you have 1 million easily enumerated cases.
For example, say I claim that the chance of winning a lottery by guessing a 6-digit number is 1 in a million. By the logic of the article this is a preposterous, egotistical notion unless I can come up with a million or so other statements of similar confidence. Easy enough: “the winning number is n” for each number 1 through 1,000,000. I think this has been used as an example in another article somewhere. These 1 million statements have a correspondence with similar statements like “2+2 is not 5”, etc. What is 2+2? Is it 4? Is it 5? Is it 6? Etc. If the lottery example counts as “independent” statements then so does the 2+2 series. And if they do not, then are we saying it’s egotistical to demand to know what the probability of the lottery is?
Incidentally, the lottery example isn’t a set of independent statements in the probability sense: knowing whether one statement is true or false gives me information about all the others. E.g., if you tell me the winning number is 1, then I know it’s not 2. So what is the meaning of the word “independent” when asking for independent statements in this article? It seems to be some vague sense of not having much to do with each other somehow. Is it ever possible to have a large number of statements like this?
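In the formal sense the dependence is easy to exhibit: with a uniform winner, the probability of “the winner is not j” changes once you condition on “the winner is not i”. A sketch with exact fractions:

```python
from fractions import Fraction

N = 1_000_000  # lottery numbers 1..N, winner drawn uniformly (as in the example)

# S_i is the statement "the winning number is not i", for distinct i and j.
p_Sj = Fraction(N - 1, N)               # unconditional P(S_j)
p_Sj_given_Si = Fraction(N - 2, N - 1)  # winner now uniform over the N-1 values != i
p_Sj_given_not_Si = Fraction(1)         # if the winner IS i, it certainly isn't j

assert p_Sj != p_Sj_given_Si  # not independent in the formal sense...
print(float(p_Sj), float(p_Sj_given_Si))  # ...though the difference is minuscule
```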
In the previous essay in this series, evidence acceptable for thinking 2+2=3 was discussed. One example was that the person might be hypnotized. To me that seems like the most realistic explanation, and certainly a likely one. That’s great, but if you’ve been hypnotized to think 2+2=3 like that, isn’t it suddenly much more likely that you might have been hypnotized to think any number of other similar-confidence statements are true? So doesn’t this challenge the real independence of all those supposedly independent statements of similar confidence you might have made?
It seems like this word “independent” is a problem within the article.
I’m 99 percent sure that the statement “consciousness exists/is” has PROBABILITY 1 of being true. All of the specificities we associate with it certainly do not, but the fact that something is experiencing something seems irrefutable. Can someone concoct a line of reasoning that would prove this wrong, say, similar to 2 + 2 = 3?
I’m not sure what PROBABILITY means the way you’re using it.
Are dogs conscious? Ants? Plants?
In the case of consciousness, this does seem valid (to me), to the extent that something I don’t understand well enough to create, can be said to exist.* However, not everything people say about their experience should be taken without some salt—the literature on biases (replications aside) claims that 1) there are ways to manipulate people’s decisions where 2) they claim said thing which ‘had a measurable effect’ had no effect.
*That is, if we’re not conscious, then what would consciousness mean? The difficulty of ruling whether this applies, or to what degree it does, is however, less clear.
The Banach Tarski Paradox is a plausible way in which 1 = 2, and thus 3 = 2 + 2.
I agree that you can never be “infinitely certain” about the way the physical world is (because there’s always a very tiny possibility that things might suddenly change, or everything is just a simulation, or a dream, or […]), but you should assign probability 1 to mathematical statements for which there isn’t just evidence, but actual, solid proof.
Suppose you have the choice between the following options: (A) You get a lottery ticket with a 1 - Epsilon chance of winning. (B) You win if 2+2=4, 53 is a prime number, and Pi is an irrational number.
Is there any Epsilon > 0 for which you would choose option A? What if something really bad happens if you lose (like all of humanity being tortured for [insert large number] years)?
I would choose option B for any Epsilon > 0, which means assigning Bayes-probability 1 to option B.
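The decision rule being applied here is just expected-value comparison: take B whenever your credence in the conjunction exceeds 1 - Epsilon. A tiny sketch (the particular numbers are arbitrary) showing that any credence short of 1 gets flipped by a small enough Epsilon:

```python
def prefer_b(epsilon, p_math):
    """Take option B (win iff the mathematical statements hold, with credence
    p_math) over option A (win with probability 1 - epsilon)?"""
    return p_math > 1 - epsilon

# Arbitrary illustrative numbers: if your credence in the conjunction is
# 1 - 1e-12 rather than exactly 1, a small enough epsilon flips the choice.
print(prefer_b(1e-9, 1 - 1e-12))   # True: B still preferred here
print(prefer_b(1e-15, 1 - 1e-12))  # False: now the near-certain lottery wins
```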
You might want to see How to Convince Me That 2 + 2 = 3
Even if you believe that mathematical truths are necessarily true, you can still ask why you believe that they are necessarily true. What caused you to believe it? Likely whatever process it is is fallible.
I’ll quote you what I commented elsewhere on this topic:
I realize I haven’t engaged with your Epsilon scenario. It does seem pretty hard to imagine and assign probabilities to, but actually assigning 1 seems like a mistake.
Assigning Bayes-probabilities <1 to mathematical statements (that have been definitely proven) seems absurd and logically contradictory, because you need mathematics to even assign probabilities.
If you assign any Bayes probability to the statement that Bayes probabilities even work, you already assume that they do work.
And, arguably, 2+2=4 is much simpler than the concept of Bayes-probability. (To be fair, the same might not be true for my most complex statement, that Pi is irrational.)
The link in footnote 2 is dead.
The link to Peter de Blanc is dead, try https://web.archive.org/web/20160305092845/http://www.spaceandgames.com/?p=27
I just had a click moment, and click moments should be shared, so here I go.
I was thinking: why shouldn’t I be able to make 10,000 statements similar to 2+2=4 and get them all right? 1,000,000 even? 1,000,000,000? Any arbitrary N ∈ ℕ?
All I have to do is come up with simple additions of different numbers, and since it’s all math and they are all tautologies, there is no reason why I can’t be right on all of them. Or is there?
So the obvious reason is that it takes time, and my life is limited. Once I’m dead, I can’t make any more statements. But… is this really a valid reason? Why should my death, in the future, affect the confidence I put in 2+2=4 now?
So, let’s assume for the sake of the argument that I’m going to live forever. Or at least until I can come up with all the statements I need to come up with.
Next problem—I’m going to get tired. 16 hours a day? For years? I don’t think I can talk straight for more than an hour!
Since I assumed myself immortality I can also assume myself infinite stamina, but there is an easier way to solve this—just use my immortality. Eliezer used 16 hours a day and 20 seconds per statement to give a feeling of how large these numbers are, but since time is not an issue I can just do one statement a day. Eventually I’ll hit whatever arbitrary quota I need to hit.
And here we reach the problem that made it click: while there is no limit to the number of operands I can put in my additions, the number of small operands sure is limited. And by “small” I mean the representation: 2 and 6 are larger in value than 0.003464326436432662364326 and 0.04326432632626243665432, but the former are much easier to add up than the latter.
So, eventually I’ll run out of additions that involve only simple numbers, and have to use at least one operand with ten digits. Later on, a hundred digits. A thousand digits! O(log10 N) digits, but N is unbounded…
I am not 100% confident I can do math with these numbers and never make a wrong calculation.
Sure, I can write it down and reduce the chance of error, but not to zero. And I can double-check and triple-check, but since no single check has a 100% probability of finding all potential mistakes, the combination of all checks can’t do that either.
Intuitively, the more digits there are, the more likely I am to err. I’m more confident in my ability to add numbers with 100 digits than in my ability to add numbers with 200 digits. And I’m even more confident in my ability to add numbers with 50 digits. So generally speaking, I’m more confident in my ability to add numbers with n digits than in my ability to add numbers with n+1 digits.
But there is no n for which I’m 100% confident in my ability to add numbers with n digits but less than 100% confident in my ability to add numbers with n+1 digits.
So why should I assign a zero probability to my butchering the addition of single-digit numbers?
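The argument can be made quantitative with a toy model: assume (purely for illustration) some small independent per-digit error rate, and re-checks that each miss an existing error with some probability. The undetected-error probability grows with the number of digits and shrinks with each check, but never reaches zero:

```python
def p_error(n_digits, per_digit=1e-4, checks=0, miss=0.1):
    """P(an undetected mistake survives) when adding two n-digit numbers.
    Toy assumptions (made up for illustration): each digit-column addition
    errs independently with probability per_digit, and each re-check
    independently fails to catch an existing error with probability miss."""
    p_mistake = 1 - (1 - per_digit) ** n_digits
    return p_mistake * miss ** checks

assert p_error(200) > p_error(100)         # more digits, more room to err
assert p_error(1, checks=10) < p_error(1)  # checking helps...
assert p_error(1, checks=10) > 0           # ...but never reaches zero
```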
Imagine you set up a program that will continually resolve 2 + 2 after your death. Perhaps it will survive much longer than entropy will allow us to survive. It has a very nice QPC timer.
It uses binary, of course. After all, you can accomplish binary with some simple LEDs, or just, dots. Little dots. So you accomplish your program, set it to run using the latest CMBR-ran entropic technology, and no one attends your funeral, because you are immortal, but immortality does not survive entropy. At least, within the same uncertainty as you failing to state 2 + 2 = 4. Your brain remains remarkably logical through this. After all, it is highly overgeared, now, having been immortal. You are the 2 + 2 master equivalent of Ronnie O’Sullivan. Flawed, yes, but goodness, you can play a mean game of snooker. Sometimes you even get sneaky, and throw in a 4 = 2 + 2.
Your entropic death approaches. You write the code, and having made sure₁ of it, you set your canary to alert it of your death: the moment you fail to accomplish the scheduled 2 + 2 = 4 which continues the cosmic clock of the universe’s entropy.
It is a simple equation. It takes very few bits to accomplish. 10 + 10 = 100₂ ~
Oh. Wait. It’s now 10 + 10 = 100? That doesn’t fulfill our need for 2 + 2 = 4, since the semantics aren’t preset. Well, what if we try this?
• • + • • = • • • •
Ah, yes, back to something reasonable. There are two dots and two dots which indicate 4 dots. This makes sense, possibly.
But the question wasn’t to resolve • • + • • ; but 2 + 2. So, in the spirit of integrity, we need to convert it to a displayable format for some future god-king race of the Anunaki to come witness our last, single work of humanity. Thus, you convert
• • + • • = • • • •
into
10 + 10 = 100
into
2 + 2 = 4
Since it only necessitates one character as the result, it is also the most efficient on power.
Are you 100% confident in anything aside from yourself, even if you made it?
Because you made 2 + 2 = 4. It’s just your idea. At that later immortal point in time, you are the only thing that still thinks 2 + 2 is relevant. The free floating quarks can barely find a date, let alone double :wink: date.
And this has at least three points of failure.
₁The code has at least fifteen points of failure.
₂This has at least eight points of failure.
Much like how I cannot assign a probability of 1 to my brain for any task, no matter how simple, I cannot assign a probability of 1 to a fallible CPU, no matter the quality. It could very well be the computer in Hitchhiker’s. I’ve been wrong on too many simple things by accident not to realize this.
I have suddenly become mildly interested in investigating an edge case of this argument. I am not coming at this from the perspective of defending the statement of infinite certainty; it is only useful in certain nonsense-arguments. I just found it kinda fun, and maybe an answer would improve my understanding of the reasoning behind this post.
So, let’s suppose you have a statement so utterly trivial and containing so little practical sense that you wouldn’t even think of it as a worthwhile statement, for example “A is A”. Now, this is a bad example, because you can already see the nonsense incoming, but I’m not sure if there are any good ones. Let’s then go by the practical definition of certainty of 1 - 1/1000: you need to collect about 1000 statements on the same level of certainty but with different actual drivers behind them, say them out loud, and be wrong once. The only problem is there are, like, 10 of these, ever. The other ones are too complex and can’t be put on the same I-am-sure-of-this team. So you can’t properly measure whether you are sure of this even on the scale of 90%. Technically, this may pass as a candidate for 1.0 certainty by the “say and be wrong rarely” definition, because there are only ever this many such trivial statements, and the probability of you being correct on all of them, no matter how many times you repeat, is substantial.
I also don’t think it would bother me much if I was stripped of possibility of changing my mind about “A is A”.