Deleting paradoxes with fuzzy logic
You’ve all seen it. Sentences like “this sentence is false”: if they’re false, they’re true, and vice versa, so they can’t be either true or false. Some people solve this problem by doing something really complicated: they introduce infinite type hierarchies wherein every sentence you can express is given a “type”, which is an ordinal number, and every sentence can only refer to sentences of lower type. “This sentence is false” is not a valid sentence there, because it refers to itself, but no ordinal number is less than itself. Eliezer Yudkowsky mentions but says little about such things. What he does say, I agree with: ick!
In addition to the sheer icky factor involved in this complicated method of making sure sentences can’t refer to themselves, we have deeper problems. In English, sentences can refer to themselves. Heck, this sentence refers to itself. And this is not a flaw in English, but something useful: sentences ought to be able to refer to themselves. I want to be able to write stuff like “All complete sentences written in English contain at least one vowel” without having to write it in Spanish or as an incomplete sentence.1 How can we have self-referential sentences without having paradoxes that result in the universe doing what cheese does at the bottom of the oven? Easy: use fuzzy logic.
Now, take a nice look at the sentence “this sentence is false”. If your intuition is like mine, this sentence seems false. (If your intuition is unlike mine, it doesn’t matter.) But obviously, it isn’t false. At least, it’s not completely false. Of course, it’s not true, either. So it’s neither true nor false. Nor is it the mythical third truth value, clem2, as clem is not false, making the sentence indeed false, which is a paradox again. Rather, it’s something in between true and false—“of medium truth”, if you will.
So, how do we represent “of medium truth” formally? Well, the obvious way to do that is using a real number. Say that a completely false sentence has a truth value of 0, a completely true sentence has a truth value of 1, and the things in between have truth values in between.3 Will this work? Why, yes, and I can prove it! Well, no, I actually can’t. Still, the following, trust me, is a theorem:
Suppose there is a set of sentences, and there are N of them, where N is some (possibly infinite) cardinal number, and each sentence’s truth value is a continuous function of the other sentences’ truth values. Then there is a consistent assignment of a truth value to every sentence. (More tersely, every continuous function [0,1]^N → [0,1]^N for every cardinal number N has at least one fixed point.)
So for every set of sentences, no matter how wonky their self- and cross-references are, there is some consistent assignment of truth values to them. At least, this is the case if all their truth values vary continuously with each other. This won’t happen under strict interpretations of sentences such as “this sentence’s truth value is less than 0.5”: this sentence, interpreted as black and white, has a truth value of 1 when its truth value is below 0.5 and a truth value of 0 when it’s not. This is inconsistent. So, we’ll ban such sentences. No, I don’t mean ban sentences that refer to themselves; that would just put us back where we started. I mean we should ban sentences whose truth values have “jumps”, or discontinuities. The sentence “this sentence’s truth value is less than 0.5” has a sharp jump in truth value at 0.5, but the sentence “this sentence’s truth value is significantly less than 0.5” does not: as the value it ascribes to itself falls from 0.5 toward 0.4 or so, its actual truth value rises smoothly from 0.0 to 1.0, so somewhere in between the two must agree, leaving us a consistent truth value for that sentence around 0.49.
Edit: I accidentally said “So, we’ll not ban such sentences.” That’s almost the opposite of what I wanted to say.
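To make the fixed point concrete, here is a minimal sketch in Python. The particular shape I give “significantly less than 0.5” (a logistic ramp centred at 0.49 with steepness 200) is invented for illustration; any continuous decreasing curve would do, and bisection finds where it crosses the diagonal:

```python
import math

# Truth of "the value v is significantly less than 0.5", modeled as a steep
# logistic ramp. The centre (0.49) and steepness (200) are invented choices.
def significantly_less_than_half(v):
    return 1.0 / (1.0 + math.exp(200.0 * (v - 0.49)))

# A consistent truth value is any v with f(v) == v. Since f is continuous
# and decreasing, f(v) - v changes sign exactly once on [0, 1]: bisect.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if significantly_less_than_half(mid) > mid:
        lo = mid  # f is still above the diagonal, so the fixed point is higher
    else:
        hi = mid
print(f"consistent truth value is about {lo:.4f}")  # roughly 0.49
```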
Now, at this point, you probably have some ideas. I’ll get to those one at a time. First, is all this truth value stuff really necessary? To that, I say yes. Take the sentence “the Leaning Tower of Pisa is short”. This sentence is certainly not completely true; if it were, the Tower would have to have a height of zero. It’s not completely false, either; if it were, the Tower would have to be infinitely tall. If you tried to come up with any binary assignment of “true” and “false” to sentences such as these, you’d run into the Sorites paradox: wherever you drew the line between “tall” and “short”, a tower a millimeter taller than it would be “tall” and a tower a millimeter shorter would be “short”, which is absurd. It would make a lot more sense if a change of height of one millimeter simply changed the truth value of “it’s short” by about 0.00001.
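A toy membership function makes that arithmetic visible. The linear ramp below, and its 100-meter width, are invented here; they are chosen so that the sentence is completely true at height zero and one millimeter of extra height costs exactly 0.00001 of truth:

```python
# Truth of "the tower is short" as a linear ramp in height. The 100 m ramp
# width is an invented illustration, giving 0.00001 truth per millimeter.
def is_short(height_m, ramp_width_m=100.0):
    v = 1.0 - height_m / ramp_width_m
    return max(0.0, min(1.0, v))  # clamp into [0, 1]

print(is_short(56.0))    # the Leaning Tower of Pisa (~56 m): 0.44
print(is_short(56.001))  # one millimeter taller: 0.43999
```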
Second, isn’t this just probability, which we already know and love? No, it isn’t. If I say that “the Leaning Tower of Pisa is extremely short”, I don’t mean that I’m very, very sure that it’s short. If I say “my mother was half Irish”, I don’t mean that I have no idea whether she was Irish or not, and might find evidence later on that she was completely Irish. Truth values are separate from probabilities.
Third and finally, how can this be treated formally? I say, to heck with it. Saying that truth values are real numbers from 0 to 1 is sufficient; regardless of whether you say that “X and Y” is as true as the product of the truth values of X and Y or that it’s as true as the less true of the two, you have an operation that behaves like “and”. If two people have different interpretations of truth values, you can feel free to just add more functions that convert between the two. I don’t know of any “laws of truth values” that fuzzy logic ought to conform to. If you come up with a set of laws that happen to work particularly well or be particularly elegant (percentiles? decibels of evidence?), feel free to make it known.
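For instance, here is a quick sketch of the two “and” operations just mentioned (minimum and product), with the “or” each induces by De Morgan duality; both collapse to ordinary two-valued logic when every value is 0 or 1:

```python
# Two common choices of fuzzy conjunction; both agree with Boolean logic
# at the endpoints 0 and 1 but differ in between.
def and_min(x, y):  return min(x, y)   # "as true as the less true of the two"
def and_prod(x, y): return x * y       # "as true as the product"
def not_(x):        return 1.0 - x

def or_from(and_fn, x, y):
    # De Morgan dual: X or Y == not(not X and not Y)
    return not_(and_fn(not_(x), not_(y)))

x, y = 0.7, 0.4
print(and_min(x, y), and_prod(x, y))                    # 0.4 and ~0.28
print(or_from(and_min, x, y), or_from(and_prod, x, y))  # 0.7 and ~0.82
```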
1. ^ The term “sentence fragment” is considered politically incorrect nowadays due to protests by incomplete sentences. “Only a fragment? Not us! One of us standing alone? Nothing wrong with that!”
2. ^ I made this word up. I’m so proud of it. Don’t you think it’s cute?
3. ^ Sorry, Eliezer, but this cannot be consistently interpreted such that 0 and 1 are not valid truth values: if you did that, then the modest sentence “this sentence is at least somewhat true” would always be truer than itself, whereas if 1 is a valid truth value, it is a consistent truth value of that sentence.
The crisp portion of such a self-reference system will be equivalent to a Kripke fixed-point theory of truth, which I like. It won’t be the least fixed point, however, which is the one I prefer; still, that should not interfere with the normal mathematical reasoning process in any way.
In particular, the crisp subset which contains only statements that could safely occur at some level of a Tarski hierarchy will have the truth values we’d want them to have. So, there should be no complaints about the system coming to wrong conclusions, except where problematically self-referential sentences are concerned (sentences which are assigned no truth value in the least fixed point).
So; the question is: do the sentences which are assigned no truth value in Kripke’s construction, but are assigned real-numbered truth values in the fuzzy construction, play any useful role? Do they add mathematical power to the system?
For those not familiar with Kripke’s fixed points: basically, they allow us to use self-reference, but to say that any sentence whose truth value depends eventually on its own truth value might be truth-value-less (ie, meaningless). The least fixed point takes this to be the case whenever possible; other fixed points may assign truth values when it doesn’t cause trouble (for example, allowing “this sentence is true” to have a value).
If discourse about the fuzzy value of (what I would prefer to call) meaningless sentences adds anything, then it is by virtue of allowing structures to be defined which could not be defined otherwise. It seems that adding fuzzy logic will allow us to define “essentially fuzzy” structures: concepts which are fundamentally ill-defined. But in terms of the crisp structures that arise (correct me if I’m wrong), it seems fairly clear to me that nothing will be added that couldn’t be added just as well (or better) by adding talk about the class of real-valued functions that we’d be using for the fuzzy truth-functions.
To sum up: reasoning in this way seems to have no bad consequences, but I’m not sure it is useful...
By the way, how would you incorporate probabilities into binary logic? Either you can include statements about probabilities in binary logic (“probability on top of logic”), or you can assign probabilities to binary logic statements (“logic on top of probability theory”). The situation is just analogous to that of fuzziness. If you do #1, that means binary logic is the most fundamental layer. If you do #2, I can also do an analogous thing with fuzziness.
The rules of probability reduce to the rules of binary logic when the probabilities are all zero or one, so you get binary logic for free just by using probability.
Yes, we all know that ;)
But under this approach the binary logic is NOT operating at a fundamental level—it is subsumed by probability theory. In other words, what is true in the binary logic is not really true; it depends on the probability assigned to the statement, which is external to the logic. In like manner, I can assign fuzzy values to a binary logic which are external to the binary logic.
It’s good that you pointed out Kripke’s fixed point theory of truth as a solution to the Liar’s paradox. It seems to be an acceptable solution.
On the other hand, I also agree that “fuzziness as a matter of degree” can be added on top of a binary logic. That would be very useful for dealing with commonsense reasoning—perhaps even indispensable.
What is particularly controversial is whether truth should be regarded as a matter of degree, ie, the development of a fuzzy-valued logic. At this point, I am kinda 50-50 about it. The advantage of doing this is that we can translate commonsense notions easily, and it may be more intuitive to design and implement the AGI. The disadvantage is that we need to deal with a relatively new form of logic (ie, many-valued logic) and its formal semantics, proof theory, model theory, deduction algorithms, etc. With binary logic we may be on firmer ground.
YKY,
The problem with Kripke’s solution to the paradoxes, and with any solution really, is that it still contains reference holes. If I strictly adhere to Kripke’s system, then I can’t actually explain to you the idea of meaningless sentences, because it’s always either false or meaningless to claim that a sentence is meaningless. (False when we claim it of a meaningful sentence; meaningless when we claim it of a meaningless one.)
With the fuzzy way out, the reference gap is that we can’t have discontinuous functions. This means we can’t actually talk about the fuzzy value of a statement: any claim “This statement has value X” is a discontinuous claim, with value 1 at X and value 0 everywhere else. Instead, all we can do is get arbitrarily close to saying that, by having continuous functions that are 1 at X and fall off sharply around X… this, I admit, is rather nifty, but it is still a reference gap. Warrigal refers to actual values when describing the logic, but the logic itself is incapable of doing that without running into paradox.
About the so-called “discontinuous truth values”, I think the culprit is not that the truth value is discontinuous (it doesn’t make sense to say a point-value is continuous or not), but rather that we have a binary predicate, “less-than”, which is a discontinuous truth functional mapping.
The statement “less-than(tv, 0.5)” seems to be a binary statement. If we make that predicate fuzzy, it becomes “approximately less than 0.5”, which we can visualize as a sigmoidal curve, and this curve intersects with the slope=1 line at 0.5. Thus, the truth value of the fuzzy version of that statement is 0.5, ie, indeterminate.
All in all, this problem seems to stem from the fact that we’ve introduced the binary predicate “less-than”.
I’d like to clear this up for myself. You’re saying that under Kripke’s system we build up a tower of meaningful statements with infinitely many floors, starting from “grounded” statements that don’t mention truth values at all. All statements outside the tower we deem meaningless, but statements of the form “statement X is meaningless” can only become grounded as true after we finish the whole tower, so we aren’t supposed to make them.
But this looks weird. If we can logically see that the statement “this statement is true” is meaningless under Kripke’s system, why can’t we run this logic under that system? Or am I confusing levels?
Call it “expected” truth, analogous to “expected value” in prob and stats. It’s effectively a way to incorporate a risk analysis into your reasoning.
Yes, I have worked out a fuzzy logic with probability distributions over fuzzy values.
I think the original post is not specific enough to be useful.
I see two essential moot points:
1) Why should there be a system of continuous correspondences between the truth values of sentences that has anything to do with some intuitive notion of truth values?
2) Are the truth values (of the sentences) after taking the fixed point actually useful? E.g., couldn’t it be that we end up with truth values of 1/2 for almost every sentence we can come up with?
Before these points are cleared, the original post is merely an extremely vague speculation.
A closely related analogue to the second issue: in NP-hard optimization problems with a lot of {0,1} variables, it’s a common problem that after a continuous relaxation the system is easily (polynomially) solvable, but the solution is worthless, as a large fraction of the variables end up at one half, which basically says: “no information”.
I can’t figure out what you’re trying to ask here.
I suppose the best answer I can give to this is “maybe”. If the logic operations that you use are 1-x, min(x,y), and max(x,y), and the sentences are entirely baseless (i.e. no sentence can be calculated independent of all the others), then a truth value of 1/2 for everything will always be consistent. If your sentences happen to actually form a hierarchy where sentences can only talk about sentences lower down, fuzzy logic will give a good answer.
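A quick sketch of why the all-1/2 assignment is consistent for baseless sentences built from those three operations: each operation maps all-1/2 inputs back to 1/2, so every formula does too. The random formula generator below is invented for the demo:

```python
import random

def random_formula(n, depth=4):
    """A random formula over the values of sentences 0..n-1, built only
    from 1-x, min, and max (no constants, no base facts)."""
    if depth == 0:
        i = random.randrange(n)
        return lambda vals: vals[i]
    op = random.choice(["not", "and", "or"])
    a = random_formula(n, depth - 1)
    if op == "not":
        return lambda vals: 1 - a(vals)
    b = random_formula(n, depth - 1)
    return (lambda vals: min(a(vals), b(vals))) if op == "and" \
        else (lambda vals: max(a(vals), b(vals)))

n = 5
system = [random_formula(n) for _ in range(n)]
vals = [0.5] * n
print([f(vals) for f in system])  # always [0.5, 0.5, 0.5, 0.5, 0.5]
```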
The NP-hard optimization thing you cite is interesting; do you have a link?
Finally, in my defense, the purpose of this post was mainly to advocate for the use of fuzzy logic through the insight that it resolves paradoxes in a manner much more elegant than ordinal type hierarchy thingies, mentioning that fuzzy logic seems to be the only good way to deal with subjective things such as tallness and beauty anyway.
To 1):
I suspected, but was not sure, that you meant the standard min/max relaxation of logical operators. You could have had more elaborate plans (I could not rule out) that could have led to unexpected interesting consequences, but this is highly speculative. An analogue again from combinatorial optimization: moving away from linear (essentially min/max based) relaxations to semidefinite ones could non-trivially improve the performance of coloring and SAT-solving algorithms, at least asymptotically.
“The NP-hard optimization thing you cite is interesting; do you have a link?”
This is well-known practical folklore in that area: not explicitly the topic of publications, but rather part of the introductory training. If you want to have a closer look, search for randomized rounding, which is a well-established technique and can yield good results for certain problem classes, but may flop for others, exactly due to the above-mentioned dominance of fractional solutions (integer/decision variables taking half-integer values being the typical case). E.g., undergraduate course materials on the traveling-salesman problem have concrete examples of that issue occurring in practice.
Nitpick: Since this sentence doesn’t refer to the truth value of any English sentence, you’d still be able to write it even if you were using type hierarchies or the like. I think.
I might be missing something, but it seems as if you’re needlessly complicating the situation.
First of all, I’m not convinced that sentences ought to be able to self-reference. The example you give, “All complete sentences written in English contain at least one vowel”, isn’t necessarily self-referencing. It’s stating a rule which is inevitably true, and which it happens to conform to. I could equally well say “All good sentences must at least one verb.” This is not a good sentence, but it does communicate a grammatical rule.
But none of this has a priori truth—they just happen to conform to accepted standards—and I don’t think they demonstrate the usefulness of self-referencing. English grammar allows you to self-reference, but defining “Cat (n): a cat” is a tautology. English also allows you to ask the question “What happened before time began?” and while that is a perfectly valid sentence, it’s a meaningless question.
As a corollary, mathematical notation allows me to write “2+2=5” (note—the person who writes this down isn’t claiming that 2+2=5, she is far better versed than Aurini in the reasons it equals 4, she is just demonstrating that she can write down nonsense). This doesn’t require a defense of arithmetic; it’s simple enough to point out that the equation is nonsense.
“This sentence is false.” “What happened before time?” “My pet elephant that I named George doesn’t exist.” I don’t see that a rebuttal is necessary, meaningful, or even possible in these situations. It’s enough to say “That’s stupid,” and move on to something interesting.
Warning, nitpicks follow:
The sentence “All good sentences must at least one verb.” has at least one verb. (It’s an auxiliary verb, but it’s still a verb. Obviously this doesn’t make it good; but it does detract from the point somewhat.)
“2+2=5” is false, but it’s not nonsense.
On the topic of fuzzy logic: Is there a semantics for fuzzy logic in which the fuzzy truth value of the statement “predicate P is true of object x” is the expected value of P(x) after marginalizing out a prior belief distribution over possible hidden crisp definitions of P?
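To make the question concrete, here is one sketch of what such a semantics might look like. The “hidden crisp threshold” reading of “tall” and the uniform prior over thresholds are invented for illustration:

```python
import random

# Suppose "tall" secretly means "height > t" for an unknown crisp threshold
# t, with a prior of t uniform on [150, 200] cm. The fuzzy value of "x is
# tall" is then the expected value of the crisp predicate under that prior.
def fuzzy_tall(height_cm, samples=100_000):
    hits = sum(height_cm > random.uniform(150.0, 200.0) for _ in range(samples))
    return hits / samples

print(fuzzy_tall(175))  # about 0.5
print(fuzzy_tall(195))  # about 0.9
```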
I guess not. The point is that “matters of degree” are inherently different from probabilities, and the former cannot be reduced to the latter. To best clarify this point, we need a formal semantics of fuzzy logic (where fuzziness is treated as matters of degree). I’m not sure if there’s such research in the literature, I’ll have a look when I have time...
I’m not sure they are inherently different. I read Kosko’s popular book on Fuzzy Logic many years ago and can’t remember the details of the argument, but he claimed that probabilistic logic is a special case of fuzzy logic, as propositional logic is a special case of probabilistic logic (ie, with probabilities of 0 and 1).
Several things. First, you’re claiming “probabilistic is a special case of fuzzy” but that does not imply “fuzzy is a special case of probabilistic” which was the original point of contention.
Secondly, you probably have confused fuzzy logic with “possibility theory”. There can be many types of fuzzy logic, and the issue we’re currently debating is whether “truth” can be regarded as a matter of degree, ie, fuzziness as degree of truth. Possibility theory is a special type of fuzzy theory which results from giving up the probability axiom #3, “finite additivity”. That is probably what your author is referring to.
Yes, because probabilistic logic is a special case of fuzzy logic. (The phrase “special case” is odd, because you could argue that it’s simply the correct case, and all others are wrong.)
There should be a way to break this system. Let’s see...
“This sentence doesn’t have a consistent truth value.”
Did I win?
No. The truth value of this sentence is 0.
In your attempt to make a sentence behaving a certain way, you made a sentence simply describing its behavior instead of one that actually behaves in the required manner. I nearly did that while writing this: after I wrote “this sentence is a little bit truer than itself”, it took me what seemed way too long to come up with “this sentence is at least somewhat true”.
So it’s false. Which makes it self-contradictory, and therefore false.
I think.
This makes it sound like you have indeed just reintroduced types under another name, patching “this statement is false” by forbidding “this statement has truth value 0.0”.
I think “this statement has truth value 0” is allowed.
ADDED: It has no discontinuity. It is the iterated system f(S) = 1 - S, and its fixed point is at S = .5.
But it manifestly has a discontinuity. Would you prefer the equivalent “this statement’s truth value is less than epsilon (epsilon some infinitesimal)”?
Why does it have a discontinuity?
Folks, you shouldn’t vote down legitimate questions.
Relevantly, because it’s structurally identical to Warrigal’s sample sentence, so whatever definition Warrigal is using (a perfectly standard one, it seems to me) must apply to both.
It’s structurally identical to a sample sentence that Warrigal used in describing a different approach, not the one he/she is taking.
If it manifestly has a discontinuity, you should be able to say where it is.
(In fact, it does not have a discontinuity. For not(S) = 1-S, it is completely linear: it is the iterated system f(S) = 1-S, having a fixed point at .5.)
Okay, this is getting silly. Warrigal says “The sentence ‘this sentence’s truth value is less than 0.5’ has a sharp jump in truth value at 0.5, but the sentence ‘this sentence’s truth value is significantly less than 0.5’ does not [and we will ban the first form]”. In the same way, my sentence “This sentence’s truth value is less than epsilon” has a discontinuity at epsilon. Both sentences make discontinuous claims about their truth values.
What is the “different approach” that you claim this sentence is in reference to?
(Incidentally, I agree with you that my sentence has a fixed point at 0.5 under Warrigal’s system. That’s why my original comment was criticizing the presentation and not necessarily the content of the theory.)
The sentence we were discussing was “This statement has truth value 0”. I assumed that when you said it was structurally identical to Warrigal’s sample sentence, you were referring to this passage:
That sentence refers to the traditional ways around Russell’s paradox.
You seem to say discontinuity when you mean a discontinuous first derivative.
It occurs to me as a “Notion” that . . .
To formulate fuzzy logic in a boolean top domain environment, I think you would need to use a probabilistic waveform type explanation, or else just treat fuzzy logic as a conditional multiplier on any boolean truth value. To encapsulate a boolean or strict logic system into fuzzy logic is trivial and evolving. You could start with just adding a percentage based on some complex criteria to any logical tautology or contradiction. By default, the truth axis of a fuzzy logic decision or logic tree is going to be known for some classes of logic systems. When used for making real-world decisions in the context of taking action in a decision or vote, the “relevance” value of a fuzzy-logic-based decision branch would be 0% for a “contradiction logic” and 100% for a “tautology logic”. So in the real world we don’t consider contradictions when we humans use fuzzy logic to decide on a course of action. The default truth and relevance value of any contradiction in fuzzy logic is zero until voted otherwise or adjusted through some mechanism.
Or maybe this doesn’t make sense? Sorry if this post is a little confused; I hadn’t thought about these ideas until just now for this discussion. Thanks for your time, and let me know if it wasn’t worth your time, please. I don’t want to be a bother, so just ask and I’ll go away if you prefer. Thx.
If all true statements are defined as non-contradictory, then you can ask more meaningful fuzzy logic questions about the relevance of several tautologies for applying to a specific real-world phenomenon. To do this you need a survey or poll of the environment and a survey or poll for determining how much the tautologies matter. For example:
Consider the following boolean true/false claims we hold to be true and consider their relevance for use in locating humans statistically. Our first rule or fuzzy logic heuristic is to take the first tautology that seems relevant and apply it to see if it matches results.
For example, consider these specific logic systems:
1) The complete theory of gravity as discovered by Newton determines that statistically humans have mass and density approximately equal to water. Gravity combined with density predicts that humans should be located in a region of space centered on the gravitational center of the planet and evenly distributed in a sphere with all air above every human and all solid matter below humans. Evidence of humans living underground or flying on airplanes above any air or with air separating humans from the center of gravity is a violation of this theory of gravity and density and random distribution mathematics.
2) The incomplete theory of plate tectonics and geography determines that in some places there will be air closer to the center of gravity than at other places. The idealized sphere distribution of humans has bumps and valleys caused by plate tectonics: some humans on mountaintops will be at local equilibrium above air molecules in valleys. Humans at one latitude and longitude can be above the air at some other latitude and longitude.
3) The incomplete theory of human behavior says that humans can move and defy uniform distribution rules about their statistical probabilistic location relative to the center of gravity. People go into rocket ships and can even be found above the air, which is in complete contradiction to the theory of gravity considered as the only tautology theory of relevance.
4) The theory of geometry and angular momentum combined with gravity proves conclusively that humans must be located exclusively in a squished (oblate) sphere distribution, with their distance from the center of gravity determined solely by their relative density compared to the rest of the material in the planetary space under consideration.
Conclusion: not all of these verifiable true boolean statements are equally valuable and equally relevant. Some of them can be discarded or are more usably incomplete than others. The utility value of any logic system is determined by the use case, and boolean logic components can be added to and removed from consideration and time-consuming calculations based on their predictive ability for the particular use case. To determine, in this example, the most relevant and important logic systems, I would like anyone who reads this to rank-order the logic system choices from most relevant and useful to least relevant and useful. The distribution of your rank-order voting will determine the comparative utility value of the multiple non-contradictory tautology logic systems. You may also add one new option to this voting poll on relevance, and others can rank your additional logic framework. When we have enough votes we start evolving and deleting logic systems from our poll until we have a high level of agreement or a stable equilibrium of voting distribution.
Boolean logic tells us what is possible. Fuzzy logic tells us what is relevant and usable.
Why?
A conjecture (seems easy to prove):
“If, in a fuzzy logic where truth values range over [0,1], we allow logical operators (which are maps from [0,1] to [0,1]) or predicates that do not intersect the slope=1 line, then we can always construct a Liar’s Paradox.”
An example is the binary predicate “less-than”, which has a discontinuity at 0.5 and hence does not intersect the y=x line.
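A sketch of the easy direction of the conjecture: the crisp “less than 0.5” map never meets the line y = x, so no self-ascribed truth value is consistent. The grid search below is just for illustration; both branches of the map sit at least 0.5 away from the diagonal:

```python
# The crisp "this sentence's truth value is less than 0.5" map: it jumps
# from 1 to 0 at 0.5 and never touches the line y = x.
def crisp_less_than_half(v):
    return 1.0 if v < 0.5 else 0.0

# No v on a fine grid satisfies f(v) == v; in fact |f(v) - v| >= 0.5 always.
print(any(abs(crisp_less_than_half(i / 1000) - i / 1000) < 1e-9
          for i in range(1001)))  # False: a Liar's Paradox
```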
O...kay. It looks like you just decided to post the first thing in your head without concern for saying anything useful.
You come up with fractional values for truth, but don’t think it’s necessary to say what a fractional truth value means, let alone formalize it.
You propose the neato idea to use fractional truth values to deal with statements like “this is tall”, and boost it with a way to adjust such truth values as height varies. Somehow you missed that we already have a way to handle such gradations; it’s called “units of measurement”. We don’t need to say, “It’s 0.1 true that a football field is long”; we just say, “it’s true that a football field is 100 yards long.”
Anyway, I thought I’d use this opportunity to say something useful. I was just reading Gary Drescher’s Good and Real (discussed here before), where he gives the most far-reaching, bold response to the claim that Goedel’s theorem proves limitations to machines, and I’m surprised the argument doesn’t show up more often, and that he didn’t seem to have anyone to cite as having made it before.
It goes like this: people claim that formal systems are somehow limited in that they can’t “see” that Goedel statements of the form “This statement can’t be proven within the system” are true. Drescher attacks this at the root and says, that’s not a limitation, because the statement’s not true.
He explains that you can’t actually rule out falsehood of the Goedel statement, as many people immediately do, because its falsity still leaves room for the possibility that “This statement has a proof, but it’s infinitely long.” But then the subtle assumption that “This statement has a proof” implies “This statement is true” becomes much more tenuous. It’s far from obvious why you must accept as true a statement whose proof you can never complete.
Take that, Penrose!
Silas, a suggestion which you can take or leave, as you prefer.
This comment makes some sound points, but IMHO, in an unnecessarily personal way. Note the consistent use of the critical “you”-based formulations (“you just decided”, “you come up with”, “you propose”, “you missed that”). Contrast this with Christian’s comment, which is also critical, but consistently focuses on the ideas, rather than the person presenting them.
I have no idea why you feel the need to throw about thinly-veiled accusations that Warrigal is basically an idiot. (How else could he or she possibly have missed all these really obvious problems you so insightfully spotted?). Maybe you don’t even intend them as such (though I’m baffled as to how could you possibly miss the overtones of your statements when they’re so freakin’ OBVIOUS). But the tendency to belittle others’ intellectual capacities (rather than just their views) is one that you’ve exhibited on a number of prior occasions as well, and one that I think you would do well to try to overcome—if only so that others will be more receptive to your ideas.
PS. For the avoidance of doubt, that final para was intended in part as an ironic illustration of the problem. I’m not that un-self-aware.
PPS. Also, I didn’t vote you down.
I agree that I’ve been many times unnecessarily harsh. But seriously, take a look at a random sampling of my posts and see how many of them are that way. It’s not actually as often as you’re trying to imply.
I do it because some people cross the threshold from “honest mistake” into “not even trying”. In which case they need to know that too, not just the specifics of their error. Holding someone’s hand through basic explanations is unfair to the people who have to do the work that the initial poster should have done for themselves.
And FWIW, if anyone ever catches me in that position—where I screw up so bad that I didn’t even appear to be thinking when I posted—I hope that you treat me the same way, so that I learn not just my specific error, but why it was so easily avoidable. Arguably, that’s the approach you just took.
Now a suggestion for you: your comment was best communicated by private message. Why stage a degrading, self-congratulatory “intervention”? Unless...
What’s obvious to one person is seldom obvious to everybody else. There are things that seem utterly trivial to me that lots of people don’t get immediately, and many more things that seem utterly trivial to others that I don’t get immediately. That doesn’t mean that any of us aren’t trying, or deserve to be belittled for “not getting it”. (I can’t quite tell if your second paragraph is intended as justification or merely explanation; apologies if I’ve guessed wrongly).
It wasn’t intended to be self-congratulatory; it was intended to make a point. Oh well. As for being degrading, I was attempting, via irony, to help you to understand the impact of a particular style of comment. It’s a style that I would normally try to avoid, and I agree that in general such comments might be better communicated privately, and certainly in a less inflammatory way. (In this case, it honestly didn’t occur to me to send a private message. Not sure what I would have done if it had. I think the extent to which others here agree or disagree with my point is useful information for us both, but information that would be lost if the correspondence were private.)
I’m not sure what you think I was trying to imply, but I had two specific instances in mind (other than this one), and honestly wasn’t trying to imply anything beyond that.
You’re preaching to the choir here. But when Warrigal announces some grand new idea, but just shrugs off even the importance of spelling out its implications, that’s well beyond “not noticing something that’s obvious to others” and into the territory of “not giving a s---, but expecting people to do your work for you.”
Right. I “got” that the first time around (even before PS), thanks. That wasn’t what I was referring to as “degrading”; it was actually pretty clever. Good work!
The degrading bit was where you do the internet equivalent of calling someone out in public, and then going through your accumulated list of their flaws, so anyone else who doesn’t like the resident “bad guy” (guy who actually says what everyone else isn’t willing to take the karma hit for) can join the pile-on.
Sure, because what you were trying to accomplish (self-promotion, “us vs. them”) wouldn’t have been satisfied by a private message, so of course it’s not going to occur to you.
Other people seem to manage to PM me when I’m out of line (won’t name names here). But that’s generally because they’re actually interested in improving my posting, not in grandstanding.
I see no “accumulated list of [your] flaws” in what conchis has posted here. I see some comments on what you said on this particular occasion; and I see, embedded in something that (as you say you understood, and I’m sure you did) was deliberately nasty in style in order to make a point, the claim that you’ve exhibited the same pathology elsewhere as is on display here. No accumulated list; a single flaw, and even that mentioned only to point up the distinction between criticizing what someone has written and criticizing them personally.
Also: You’re being needlessly obnoxious; please desist. I am saying this in public rather than by PM because what I am trying to accomplish is (some small amount of) disincentive for other people who might wish to be obnoxious themselves. I am interested in improving not only your posting but LW as a whole.
And, FWIW, so far as I can tell I have no recollection of your past behaviour on LW, and in particular I am not saying this because I “don’t like” you.
I’m willing to apologise for publicly calling you out. While I’m still not totally convinced that PMing would have been optimal in this instance, it was a failing on my part not to have considered it at all, and I’m certainly sorry for any hurt I may have caused.
I’m also sorry that you seem to have such a poor impression of me that you can’t think of any way to explain my behaviour other than self-promotion and grandstanding. Not really big on argumentative charity, are you?
Apology accepted! :-)
I apologize for loading up on the negative motives I attributed to you. I appreciate your feedback, I would just prefer it not be done in a way that makes a spectacle of it all.
Apology likewise accepted! ;)
He cites “Goedel, Escher, Bach”, in which Hofstadter makes the same argument. Hofstadter doesn’t apply it to the silly why-we-aren’t-machines argument, though. (And Drescher doesn’t actually say that a Goedel sentence isn’t true, just that we can’t really know it’s true.)
An infinitely long proof is not a proof, since proofs are finite by definition.
The truth value of a statement does not depend on the existence of a proof anyways, the definition of truth is that it holds in any model. It is just a corollary of Goedel’s completeness theorem that syntactic truth (existence of a (finite) proof) coincides with semantic truth if the axiom system satisfies certain assumptions.
With that definition of truth, a Goedel sentence is not “true”, because there are models in which it fails to hold; neither is its negation “true”, because there are models in which it does. But that’s not the only way in which the word “true” is used about mathematical statements (though perhaps it should be); many people are quite sure that (e.g.) a Goedel sentence for their favourite formalization of arithmetic is either true or false (and by the latter they mean not-true). There’s plenty of reason to be skeptical about the sort of Platonism that would guarantee that every statement in the language of (say) Principia Mathematica or ZF is “really” true or false, but it hardly seems reasonable to declare it wrong by definition as you’re doing here.
Those people seem a bit silly, then. If you say “The Godel sentence (G) is true of the smallest model (i.e. the standard model) of first-order Peano Arithmetic (PA)” then this truth follows from G being unprovable: if there were a proof of G in the smallest model, there would be a proof of G in all models, and if there were a proof of G in all models, then by Godel’s completeness theorem G would be provable in PA. To insist that the Godel sentence is true in PA—that it is true wherever the axioms of PA are true—rather than being only “true in the smallest model of PA”—is just factually wrong, flat wrong as math.
This thread needs a link to Tarski’s undefinability theorem.
Also, you’re assuming the consistency of PA.
The people I’m thinking of—I was one of them, once—would not say either “G is true in PA” or “G is true in such-and-such a model of PA”. They would say, simply, “G is true”, and by that they would mean that what G says about the natural numbers is true about the natural numbers—you know, the actual, real, natural numbers. And they would react with some impatience to the idea that “the actual, real, natural numbers” might not be a clearly defined notion, or that statements about them might not have a well-defined truth value in the real world.
In other words, Platonists.
I think most people who know Goedel’s theorem say “G is true” and are “unreflective platonists,” by which I mean that they act like the natural numbers really exist, etc, but if you pushed them on it, they’d admit the doubt of your last couple of sentences.
Similarly, most people (eg, everyone on this thread), state Goedel’s completeness theorem platonically: a statement is provable if it is true in every model. That doesn’t make sense without models having some platonic existence. (yes, you can talk about internal models, but people don’t.) I suppose you could take the platonic position that all models exist without believing that it is possible to single out the special model. (Eliezer referred to “the minimal model”; does that work?)
You are right: you may come up with another consistent way of defining truth.
However, my comment was a reaction to Silas’s comment, in which he seemed to confuse the notions of syntactic and semantic truth, taking provability as the primary criterion. I just pointed out that even undergraduate logic courses treat semantic truth as the basis, and syntactic truth enters the picture as a consequence.
Units of measurement don’t work nearly as well when dealing with things such as beauty instead of length.
Then neither does fuzzy logic.
I think an important distinction between units of measurement and fuzzy logic is that units of measurement must pertain to things that are measurable, and they must be objectively defined, so that if two people express the same thing using units of measurement, their measurements will be the same. I see no reason that fuzzy logic shouldn’t be applicable to things that are simply a person’s impression of something.
Or perhaps it would be perfectly reasonable to relax the requirement that units of measurement be as objective as they are in practice. If Helen of Troy was N standard deviations above the norm in beauty (trivia: N is about 6), we can declare the helen equal to N standard deviations in beauty, and then agents capable of having an impression of beauty could look at random samples of people and say how beautiful they are in millihelens.
If there’s a better way of representing subjective trueness than real numbers between 0 and 1, I imagine lots of people would be interested in hearing it.
That’s still creating a unit of measurement, it just uses protocols that prime it with respect to one person rather than a physical object. It doesn’t require a concept of fractional truth, just regular old measurement, probability and interpolation.
Why don’t you spend some time more precisely developing the formalism… oh, wait
That’s why.
I don’t think it’s fair to demand a full explanation of a topic that’s been around for over two decades (though a link to an online treatment would have been nice). Warrigal didn’t ‘come up with’ fractional values for truth. It’s a concept that’s been around (central?) in Eastern philosophy for centuries if not millennia, but was more-or-less exiled from Western philosophy by Aristotle’s Law of the Excluded Middle.
Fuzzy logic has proven itself very useful in control systems and in AI, because it matches the way people think about the world. Take Hemingway’s Challenge to “write one true [factual] sentence” (for which you would then need to show 100% exact correspondence of words to molecules in all relevant situations) and one’s perspective can change to see all facts as only partially true, ie, with a truth value in [0,1].
The statement “snow is white” is true if and only if snow is white, but you still have to define “snow” and “white”. How far from 100% even reflection of the entire visible spectrum can you go before “white” becomes “off-white”? How much can snow melt before it becomes “slush”? How much dissolved salt can it contain before it’s no longer “snow”? Is it still “snow” if it contains purple food colouring?
The same analysis of most concepts reveals we inherently think in fuzzy terms. (This is why court cases take so damn long to pick between the binary values of “guilty” and “not guilty”, when the answer is almost always “partially guilty”.) In fuzzy systems, concepts like “adult” (age of consent), “alive” (cryonics), “person” (abortion), all become scalar variables defined over n dimensions (usually n=1) when they are fed into the equations, and the results are translated back into a single value post-computation. The more usual control system variables are things like “hot”, “closed”, “wet”, “bright”, “fast”, etc., which make the system easier to understand and program than continuous measurements.
Bart Kosko’s book on the topic is Fuzzy Thinking. He makes some big claims about probability, but he says it boils down to fuzzy logic being just a different way of thinking about the same underlying math. (I don’t know if this gels with the discussion of ‘truth functionalism’ above) However, this prompts patterns of thought that would not otherwise make sense, which can lead to novel and useful results.
I voted up your post for its conclusions, but would request that you make them a bit friendlier in the future...
Fuzzy logic is just sloppy probability, although Lotfi Zadeh doesn’t realize it. (I heard him give a talk on it at NIH, and my summary of his talk is: He invented fuzzy logic because he didn’t understand how to use probabilities. He actually said: “What if you ask 10 people if Bill is tall, and 4 of them say yes, but 6 of them say no? Probabilities have no way of representing this.”)
You can select your “fuzzy logic” functions (the set of functions used to specify a fuzzy logic, which say what value to assign A and B, A or B, and not A, as a function of the values of A and B) to be consistent with probability theory, and then you’ll always get the same answer as probability theory.
The rules for standard probability theory are correct. But “sloppy” fuzzy-logic probability functions, like “A or B = max(A,B); A and B = min(A,B); not(A) = 1-A”, have advantages when Bayesian logic gives lousy results. Here are 2 situations where fuzzy logic outperforms use of Bayes’ law:
You have incomplete or inaccurate information. Say you are told that A and B have a correlation of 1: P(A|B) = P(B|A) = 1. By Bayes’ law, P(A^B) = P(AvB) = P(A) = P(B). Then you’re told that P(A) and P(B) are different. You’re then asked to compute P(A^B). Bayes’ law fails you, because the facts you’ve been given are inconsistent. Fuzzy logic is a heuristic that lets you plow through the inconsistency: it enforces p(AvB) >= p(A^B), when Bayes’ law just blows up. (The sketch after these two examples illustrates this.)
You are a robot, making a plan. For every action you take, you have a probability of success that you always associate with that action. You assume that the probability of success for each step in a plan is independent of the other steps. But in reality, sometimes they are highly correlated. Because you assume probabilities are independent, you strongly favor short plans over long plans. Using fuzzy logic allows you to construct longer plans.
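A sketch of the first situation, using the min/max rules above. The marginals 0.7 and 0.5 are made up, and deliberately contradict the stated correlation of 1:

```python
# We are told P(A|B) = P(B|A) = 1, which forces P(A) = P(B) = P(A^B),
# yet we are handed different marginals: the inputs are inconsistent.
p_a, p_b = 0.7, 0.5

# Probability theory has no consistent answer here: P(A^B) would have to
# equal both 0.7 and 0.5 at once. The min/max fuzzy rules just plow on:
p_and = min(p_a, p_b)  # 0.5
p_or = max(p_a, p_b)   # 0.7
assert p_or >= p_and   # the guarantee p(AvB) >= p(A^B) mentioned above
print(p_and, p_or)
```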
Fuzzy logic is just a pragmatic computational tool. Nothing that’s going to help you get around a paradox, except in the sense that it will let you construct a model that’s inaccurate enough that the paradox disappears from sight.
When you switch to using these numbers to differentiate between “short” and “extremely short”, that’s not probability. But then you’re no longer talking about truth values. You’re just measuring things. The number 17 is no more true than the number 3.
All that said, the approach you just described is interesting. I’m missing something, but it’s very late, so I’ll have to try to figure it out tomorrow.
You may have misunderstood what Zadeh was saying. Suppose Bill is 5 feet, 9 inches in height and all ten people know this. I.e. we are not attempting to represent the likelihood that Bill is or is not tall based on the uncertain evidence given by different people. It is not 60% likely that Bill is tall, and 40% likely that he is not. He is 5 feet, nine inches and everyone knows it. No one disagrees on his actual, measured height.
Now we could taboo the word tall, and we wouldn’t lose any information; and in some contexts that might be the right thing to do. However in practical, day-to-day life humans do use words like tall that have fuzzy, non-crisp boundaries. The truth value of a word like “tall” is better expressed as a real number than a boolean value. Fuzzy logic represents the apparent disagreement on whether or not Bill is tall by saying he is 60% tall and 40% not tall.
That isn’t what distinguishes fuzzy logic from probabilities. Both would represent this case with the number 0.6. The distinguishing feature of fuzzy logic is that it uses non-probabilistic functions to compute joint probabilities, to avoid various practical and computational problems.
How do you do this? As far as I understand, it is impossible since probability is not truth functional. For example, suppose A and B both have probability 0.5 and are independent. In this case, the probability of ‘A^B’ is 0.25, while the probability of ‘A^A’ is 0.5. You can’t do this in a (truth-functional) logic, as it has to produce the same value for both of these expressions if A and B have the same truth value. This is why minimum and maximum are used.
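In code, the point looks like this (a trivial sketch; A and B are taken to be independent so that the product rule applies):

```python
# Probability is not truth-functional: A and B have the same probability,
# but "A^B" and "A^A" need different numbers, so no function of the bare
# values alone can compute a conjunction's probability.
p = {"A": 0.5, "B": 0.5}
p_A_and_B = p["A"] * p["B"]  # 0.25, using independence of A and B
p_A_and_A = p["A"]           # 0.5, since A^A is just A
# A truth-functional fuzzy logic must give both conjunctions the same
# value, e.g. min(0.5, 0.5) = 0.5 for each.
print(p_A_and_B, p_A_and_A, min(p["A"], p["B"]))
```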
Calling fuzzy logic “truth functional” sounds like you’re changing the semantics; but nobody really changes the semantics when they use these systems. Fuzzy logic use often becomes a semantic muddle, with people making the values simultaneously mean truth, probability, and measurement; interpreting them in an ad-hoc manner.
You can tell your truth-functional logic that A^A = A. Or, you can tell it that P(A|A) = 1, so that p(A^A) = p(A).
‘Truth functional’ means that the truth value of a sentence is a function of the truth values of the propositional variables within that sentence. Fuzzy logic works this way. Probability theory does not. It is not just that one is talking about degrees of truth and the other is talking about probabilities. The analogue to truth values in probability theory are probabilities, and the probability of a sentence is not a function of the probabilities of the variables that make up that sentence (as I pointed out in the preceding comment, when A and B have the same probability, but A^A has a different probability to A^B). Thus propositional fuzzy logic is inherently different to probability theory.
You might be able to create a version of ‘fuzzy logic’ in which it is non truth-functional, but then it wouldn’t really be fuzzy logic anymore. This would be like saying that there are versions of ‘mammal’ where fish are mammals, but we have to understand ‘mammal’ to mean what we normally mean by ‘animal’. Sure, you could reinterpret the terms in this way, but the people who created the terms don’t use them that way, and it just seems to be a distraction.
At least that is as far as I understand. I am not an expert on non-classical logic, but I’m pretty sure that fuzzy logic is always understood so as to be truth-functional.
Eep, maybe I should edit my post so it doesn’t say “fuzzy logic”. Not that I know that non-truth-functional fuzzy logic is a good idea; I simply don’t know that it isn’t.
I think I’ve figured it out.
You have a set of equations for p(X1), p(X2), etc., where
p(X1) = f1(p(X2), p(X3), … p(Xn))
p(X2) = f2(p(X1), p(X3), … p(Xn))
...
Warrigal is saying: This is a system of n equations in n unknowns. Solve it.
But this has nothing to do with whether you’re using fuzzy logic!
If you define the functions f1, f2, … so that each corresponds to something like
f1(p(X2), p(X3), …) = p(X2 and (X3 or X4) …)
using standard probability theory, then you’re not using fuzzy logic. If you define them some other way, you’re using fuzzy logic. The approach described lets us find a consistent assignment of probabilities (or truth-values, if you prefer) either way.
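Here is a toy instance of that “solve it” recipe. The three functions are invented for the demo, and damped fixed-point iteration is just one of many ways to look for a solution:

```python
# Each sentence's value is a continuous function of the others; we hunt
# for a consistent assignment (a fixed point) by damped iteration.
def f1(v): return 1 - v[1]                  # X1: "X2 is false"
def f2(v): return min(v[0], v[2])           # X2: "X1 and X3" (min rule)
def f3(v): return 0.5 * (1 + v[0] - v[1])   # X3: some continuous blend

fs = [f1, f2, f3]
v = [0.2, 0.8, 0.4]  # arbitrary starting guess
for _ in range(1000):
    target = [f(v) for f in fs]
    v = [0.9 * x + 0.1 * t for x, t in zip(v, target)]  # damping aids convergence
print([round(x, 3) for x in v])  # a consistent assignment: [0.5, 0.5, 0.5]
```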
Is this really the case?
In fuzzy logic, one requires that the real-numbered truth value of a sentence is a function of its constituents. This allows the “solve it” reply.
If we swap that for probability theory, we don’t have that anymore… instead, we’ve got the constraints imposed by probability theory. The real-numbered value of “A & B” is no longer a definite function F(val(A), val(B)).
Maybe this is only a trivial complication… but, I am not sure yet.
That is in fact precisely what I mean by “truth value”. What does “truth value” mean in your book?
Then what does it mean in fuzzy-logic to say “The truth value of ‘Bill is 3 feet tall’ is .5” ?
It means that Bill is pretty nearly 3 feet tall, but not exactly. Perhaps it means that for half of all practical purposes, Bill is 3 feet tall; that may be a good formalization.
I’ll mention now that I don’t know if normal treatments of fuzzy logic insist that truth value functions be continuous. Mine does, which may make it ordinary, substandard, quirky, or insightful.
I don’t understand this statement: “p(AvB) >= p(A^B), when Bayes’ law just blows up”.
p(AvB) >= p(A^B) should always be true shouldn’t it?
I know A^B → AvB is a tautology (p=1) and that the truth value of AvB → A^B depends on the values of A and B; translated into probabilities, this shows p(AvB) >= p(A^B) is true.
If you’re told p(A|B) = 1, but are given different values for p(A) and p(B), you can’t apply Bayes’ law. Something you’ve been told is wrong, but you don’t know what.
Note that the fuzzy logic rules given are a compromise between A and B having correlation 1, and being independent.