as long as you have a set of all sets and the ability to take subsets specified by properties, you get Russell’s paradox.
Yes, that’s true. What I have in mind is restricting the latter ability a bit, by the minimum amount required to get rid of paradoxes. Except if you squint at it the right way, it won’t even look like a restriction :-)
I will use the words “set” and “predicate” interchangeably. A predicate is a function that returns True or False. (Of course it doesn’t have to be Turing-computable or anything.) It’s pretty clear that some predicates exist, e.g. the predicate that always returns False (the empty set) or the one that always returns True (the set of all sets). This seems like a tiny change of terminology, but to me it seems enough to banish Russell’s paradox!
Let’s see how it works. We try to define the Russell predicate R thusly:
R(X) = not(X(X))
...and fail. This definition is incomplete. The value of R isn’t defined on all predicates, because we haven’t specified R(R) and can’t compute it from the definition. If we additionally specify R(R) to be True or False, the paradox goes away.
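As a toy illustration of how the extra clause completes the definition, here is a sketch with a three-element universe of named predicates (the universe and the names are my own invention, just for illustration; real predicates of course needn't be computable):

```python
# Toy finite model: each predicate is a truth table over a tiny universe
# of predicate names. This setup is illustrative only.
universe = ["EMPTY", "FULL", "R"]

tables = {
    "EMPTY": {x: False for x in universe},  # the always-False predicate
    "FULL":  {x: True for x in universe},   # the always-True predicate
}

# Russell-style predicate: R(X) = not X(X) for every X other than R,
# plus the extra stipulation R(R) = True (False would work equally well).
tables["R"] = {x: not tables[x][x] for x in ["EMPTY", "FULL"]}
tables["R"]["R"] = True  # the clause that completes the definition

# The defining clause holds everywhere it was stated, with no contradiction:
for x in ["EMPTY", "FULL"]:
    assert tables["R"][x] == (not tables[x][x])
print(tables["R"])  # -> {'EMPTY': True, 'FULL': False, 'R': True}
```

Nothing forces a value of R(R); the table just records whichever value we stipulate.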
To make this a little more precise: I think naive set theory can be made to work by disallowing predicates, like the Russell predicate, that are “incompletely defined” in the above sense. In this new theory we will have “AFA-like” non-well-founded sets (e.g. the Quine atom Q={Q}), and so we will need to define equality through bisimilarity. And that’s pretty much all.
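For concreteness, here is a naive bisimilarity check on sets represented as pointed directed graphs (my own toy encoding; the shared visited-set trick is only adequate for small examples like these). It identifies two separately-built Quine atoms:

```python
def bisimilar(g1, a, g2, b, visited=None):
    """Naive bisimilarity check for finite pointed set-graphs:
    each graph maps a node to the list of its element-nodes."""
    if visited is None:
        visited = set()
    if (a, b) in visited:
        return True  # coinductive step: assume equality around a cycle
    visited.add((a, b))
    xs, ys = g1[a], g2[b]
    # every element of a must match some element of b, and vice versa
    forward = all(any(bisimilar(g1, x, g2, y, visited) for y in ys) for x in xs)
    back = all(any(bisimilar(g1, x, g2, y, visited) for x in xs) for y in ys)
    return forward and back

g1 = {"Q": ["Q"]}    # a Quine atom Q = {Q}
g2 = {"Q2": ["Q2"]}  # another Quine atom, built separately
print(bisimilar(g1, "Q", g2, "Q2"))  # -> True: bisimilarity identifies them
print(bisimilar(g1, "Q", {"E": []}, "E"))  # -> False: Q is not the empty set
```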
As you can see, this is really basic stuff. There’s got to be some big idiotic mistake in my thinking—some kind of contradiction in this new notion of “set”—but I haven’t found it yet.
EDITED on May 13 2010: I’ve found a contradiction. You can safely disregard my theory.
Yes, that’s true. What I have in mind is restricting the latter ability a bit, by the minimum amount required to get rid of paradoxes.
Well, others have had this same idea. The standard example of a set theory built along those lines is Quine’s “New Foundations” or “NF”.
Now, Russell’s paradox arises when we try to work within a set theory that allows ‘unrestricted class comprehension’. That means that for any predicate P expressed in the language of set theory, there exists a set whose elements are all and only the sets with property P, which we denote {x : P(x)}.
In ZF we restrict class comprehension by only assuming the existence of things of the form {x in Y : P(x)} and {f(x) : x in Y} (these correspond respectively to the Axiom of Separation and the Axiom of Replacement).
On the other hand, in NF we grant existence to anything of the form {x : P(x)} as long as P is what’s called a “stratified” predicate. To say a predicate is stratified is to say that one can assign integer-valued “levels” to the variables in such a way that for any subexpression of the form “x is in y”, y’s level is one greater than x’s level.
Then clearly the predicate “P(x) iff x is in x” fails to be stratified (because x’s level can’t be one greater than itself). However, the predicate “P(x) iff x = x” is obviously stratified, and {x : x = x} is the set of all sets.
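A stratification check of this kind is mechanical. Here is a sketch (my own encoding: each pair (x, y) stands for a subformula “x is in y”) that propagates levels across the constraint graph and reports whether a consistent assignment exists:

```python
from collections import defaultdict, deque

def stratified(constraints):
    """Return True iff levels can be assigned so that each pair (x, y),
    read as "x is in y", satisfies level(y) == level(x) + 1."""
    adj = defaultdict(list)
    for x, y in constraints:
        adj[x].append((y, +1))   # y sits one level above x
        adj[y].append((x, -1))
    levels = {}
    for start in list(adj):
        if start in levels:
            continue
        levels[start] = 0        # anchor each connected component at 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, d in adj[v]:
                if w not in levels:
                    levels[w] = levels[v] + d
                    queue.append(w)
                elif levels[w] != levels[v] + d:
                    return False  # inconsistent cycle: not stratified
    return True

print(stratified([("x", "y"), ("y", "z")]))  # -> True  (levels 0, 1, 2)
print(stratified([("x", "x")]))              # -> False ("x in x" is unstratified)
```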
I know New Foundations, but stratification is too strong a restriction for my needs. This weird set theory of mine actually arose from a practical application—modeling “metastrategies” in the Prisoner’s Dilemma. See this thread on decision-theory-workshop.
Let’s see how it works. We try to define the Russell predicate R thusly:
R(X) = not(X(X))
...and fail. This definition is incomplete. The value of R isn’t defined on all predicates, because we haven’t specified R(R) and can’t compute it from the definition. If we additionally specify R(R) to be True or False, the paradox goes away.
How is it that the paradox “goes away”? If you “additionally specify R(R) to be True or False”, don’t you just go down one or the other of the two cases in Russell’s paradox?
Suppose we decide to specify that R(R) is true. Then, by your definition, not(R(R)) is true. That means that R(R) is false, contrary to our specification. Similarly, if we instead specify that R(R) is false, we are led to conclude that R(R) is true, again contradicting our specification.
The conclusion is that we can’t specify any truth value for R(R). Either truth value leads to a contradiction, so R(R) must be left undefined. Is that what you mean to say?
Suppose we decide to specify that R(R) is true. Then, by your definition not(R(R)) is true.
No, in this case R(X) = not(X(X)) for all X distinct from R, and additionally R(R) is true. This is a perfectly fine, completely defined, non-self-contradictory predicate.
Okay, I see. I see nothing obviously contradictory with this.
From a technical standpoint, the hard part would be to give a useful criterion for when a seemingly-well-formed string does or does not completely define a predicate. The string not(X(X)) seems to be well-formed, but you’re saying that actually it’s just a fragment of a predicate, because you need to add “for X not equal to this predicate”, and then give an additional clause about whether this predicate satisfies itself, to have a completely-defined predicate.
I guess that this was the sort of work that was done in these non-foundational systems that people are talking about.
I guess that this was the sort of work that was done in these non-foundational systems that people are talking about.
No, AFA and similar systems are different. They have no “set of all sets” and still make you build sets up from their parts, but they give you more parts to play with: e.g. they let you explicitly convert a directed graph with cycles into a set that contains itself.
I didn’t mean that what you propose to do is commensurate with those systems. I just meant that those systems might have addressed the technical issue that I pointed out, but it’s not yet clear to me how you address this issue.
I can’t say anything about this specific construction, but there is a related issue with Turing machines. The issue was whether you could find a useful subset S of the set of all Turing machines, such that the halting problem is solvable for all machines in S, and S is general enough to contain useful examples.
If I remember correctly, the answer was that you couldn’t. This feels a lot like that—I’d bet that the only way of being sure that we can avoid Russell’s paradox is to restrict predicates to such a narrow category that we can’t do much of anything useful with them.
I think you are going to run into serious problems. Consider the predicate that always returns true. Then, if I’m following, Russell’s original formulation of the paradox, involving the powerset of the set of all sets, will still lead to a contradiction.
Original form of Russell’s paradox: Let A be the set of all sets and let P(A) be the powerset of A. By Cantor, |P(A)| > |A|. But, P(A) is a subset of A, so |P(A)|<=|A|. That’s a contradiction.
Cantor’s theorem breaks down in my system when applied to the set of all sets, because its proof essentially relies on Russell’s paradox to reach the contradiction.
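To spell out that reliance (this is the standard textbook argument, not anything special to this thread): Cantor’s proof diagonalizes against an arbitrary map f : B → P(B), and when B is the set of all sets and f is the identity, the diagonal set is exactly Russell’s set.

```latex
% Cantor's diagonal set for an arbitrary f : B \to \mathcal{P}(B):
D = \{\, x \in B : x \notin f(x) \,\}
% If f were surjective, D = f(d) for some d \in B, and then
d \in D \iff d \notin f(d) = D.
% Taking B = A (the set of all sets) and f = \mathrm{id} gives
D = \{\, x : x \notin x \,\},
% which is precisely the Russell set.
```

So a theory that blocks the Russell predicate also blocks this instance of Cantor’s argument.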
Hmm, that almost seems to be cutting off the nose to spite the cliche. Cantor’s construction is a very natural construction. A set theory where you can’t prove that would be seen by many as unacceptably weak. I’m a bit fuzzy on the details of your system, but let me ask, can you prove in this system that there’s any uncountable set at all? For example, can we prove |R| > |N| ?
Yes. The proof that |R| > |N| still goes through, because predicates over N aren’t themselves members of N, so the issue of “complete definedness” doesn’t come up.
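To illustrate why that diagonalization is untouched, here is a sketch with reals stood in for by predicates over N (the particular enumeration f below is an arbitrary stand-in of my own; any enumeration would do):

```python
def f(n):
    """Stand-in enumeration of predicates over N: f(n) tests divisibility
    by n + 1. The argument works for any enumeration whatsoever."""
    return lambda k: k % (n + 1) == 0

def g(n):
    """The diagonal predicate: built to disagree with f(n) at input n."""
    return not f(n)(n)

# g cannot equal f(m) for any m, because they differ at input m.
# No self-application occurs: g takes numbers, not predicates.
for m in range(100):
    assert g(m) != f(m)(m)
print("the diagonal predicate escapes the enumeration")
```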
Hmm, this may work then and not kill off too much of set theory. You may want to talk to a professional set theorist or logician about this (my specialty is number theory so all I can do is glance at this and say that it looks plausible). The only remaining issue then becomes that I’m not sure that this is inherently better than standard set theory. In particular, this approach seems much less intuitive than ZFC. But that may be due to the fact that I’m more used to working with ZF-like objects.
Ok, I’ve read up on Cantor’s theorem now, and I think the trick is in the types of A and P(A), and the solution to the paradox is to borrow a trick from type theory. A is defined as the set of all sets, so the obvious question is, sets of what key type? Let that key type be t. Then
A: t=>bool
P(A): (t=>bool)=>bool
We defined P(A) to be in A, so a t=>bool is also a t. Let all other possible types for t be T. t=(t=>bool)+T. Now, one common way to deal with recursive types like this is to treat them as the limit of a sequence of types:
t[i] = t[i-1]=>bool + T
A[i]: t[i]=>bool
P(A[i]) = A[i+1]
Then when we take the limit,
t = lim i->inf t[i]
A = lim i->inf A[i]
P(A) = lim i->inf P(A[i])
Then suddenly, paradoxes based on the cardinality of A and P(A) go away, because those cardinalities diverge!
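A quick sanity check on the divergence claim (the starting sizes below are arbitrary choices of mine): the function-space part alone forces |t[i]| to be at least 2^|t[i-1]|, so the sizes blow up immediately.

```python
# Iterating |t[i]| = 2**|t[i-1]| + |T| from small assumed starting sizes:
T_size = 3   # assumed |T|, the non-predicate part (arbitrary illustration)
size = 1     # assumed |t[0]| (arbitrary illustration)
sizes = []
for i in range(3):
    size = 2 ** size + T_size   # the t[i-1]=>bool part contributes 2**size
    sizes.append(size)
print(sizes)  # -> [5, 35, 34359738371]; the next step already exceeds 2**(2**35)
```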
I’m not sure I know enough about type theory to evaluate this. Although I do know that Russell’s original attempts to repair the defect involved type theory (Principia Mathematica uses a form of type theory; however, in that form one still can’t form the set of all sets). I don’t think the above works, but I don’t quite see what’s wrong with it. Maybe Sniffnoy or someone else more versed in these matters can comment.
I don’t know anything about type theory; when I wrote that I heard it has philosophical problems when applied to set theory, I meant I heard that from you. What the problems might actually be was my own guess...
I’m not deeply familiar with set theory, but cousin_it’s formulation looks valid to me. Isn’t the powerset of the set of all sets just the set of all sets of sets? (Or equivalently, the predicate X=>Y=>Z=>true.) How would you use that to reconstruct the paradox in a way that couldn’t be resolved in the same way?
The powerset of the set of all sets may or may not be the set of all sets (it depends on whether or not you accept atoms in your version of set theory). However, Cantor’s theorem shows that for any set B, the power set of B has cardinality strictly larger than B. So if B=P(B) you’ve got a problem.
No, in this case R(X) = not(X(X)) for all X distinct from R, and additionally R(R) is true. This is a perfectly fine, completely defined, non-self-contradictory predicate.
Why is R(X) = not(X(X)) only for R =/= X? In Russell’s version, X should vary over all predicates/sets, meaning that when we instantiate X with R, we get
R(R) = ¬R(R)
as per the paradox.
Not sure what your objection is. I introduced the notion of “incompletely defined predicate” to do away with Russell’s version of the predicate.
I think you are going to run into serious problems. Consider the predicate that always returns true. Then, if I’m following, Russell’s original formulation of the paradox, involving the powerset of the set of all sets, will still lead to a contradiction.
I can’t seem to work out for myself what you mean. Can you spell it out in more detail?
Original form of Russell’s paradox: Let A be the set of all sets and let P(A) be the powerset of A. By Cantor, |P(A)| > |A|. But, P(A) is a subset of A, so |P(A)|<=|A|. That’s a contradiction.
The original form of Russell’s (in fact Zermelo’s) paradox is not this. The original form is R = {x | x is not a member of x}.
That leads to both
R is a member of R
and
R is not a member of R.
And that is the original form of the paradox.
No. See for example this discussion. The form you give, where it is described as a simple predicate recursion, was not the original form of the paradox.
I don’t know anything about type theory; when I wrote that I heard it has philosophical problems when applied to set theory, I meant I heard that from you. What the problems might actually be was my own guess...
Huh. Did I say that? I know almost nothing about type theory. When did I say that?