Proofs in second-order logic work the exact same way as proofs in first-order logic: You prove a sentence by beginning with axioms, and construct the sentence using inference rules. In first-order logic, you have individuals, functions (from individuals to individuals), terms, and predicates (loosely speaking, functions from individuals to truth values), but may only quantify over individuals. In second-order logic, you may also quantify over predicates and functions. In third-order logic, you add functions of functions of individuals, and, in fourth-order logic, the ability to quantify over them. This continues all the way up to omega-eth order logic (a.k.a.: higher-order logic), where you may use functions of arbitrary orders. The axioms and inference rules are the same as in first-order logic.
Wait, I said nothing about sets above. Sets are no problem: a set containing certain elements is equivalent to a predicate which only returns true on those elements.
I also lied about it being the same as first-order logic—a couple are added. One very useful axiom scheme is comprehension, which roughly means that you can find a predicate equivalent to any formula. You can think of this as being an axiom schema of first-order logic, except that, since it cannot be stated until fourth-order logic, it contains no axioms in first-order logic.
(My formal logic training is primarily in Church’s theory of types, which is very similar to higher-order logic [though superficially extremely different]. I probably mixed-up terminology somewhere in the above.)
The comprehension axiom schema (or any other construction that can be used by a proof checker algorithm) isn’t enough to prove all the statements people consider to be inescapable consequences of second-order logic. You can view the system you described as a many-sorted first-order theory with sets as one of the sorts, and notice that it cannot prove its own consistency (which can be rephrased as a statement about the integers, or about a certain Turing machine not halting) for the usual Goedelian reasons. But we humans can imagine that the integers exist as “hunks of Platoplasm” somewhere within math, so the consistency statement feels obviously true to us.
It’s hard to say whether our intuitions are justified. But one thing we do know, provably, irrevocably and forever, is that being able to implement a proof checker algorithm precludes you from ever getting the claimed benefits of second-order logic, like having a unique model of the standard integers. Anyone who claims to transcend the reach of first-order logic in a way that’s relevant to AI is either deluding themselves, or has made a big breakthrough that I’d love to know about.
The comprehension axiom schema (or any other construction that can be used by a proof checker algorithm) isn’t enough to prove all the statements people consider to be inescapable consequences of second-order logic.
Indeed, since the second-order theory of the real numbers is categorical, and since it can express the continuum hypothesis, an oracle for second-order validity would tell us either that CH or ¬CH is ‘valid’.
If I understand correctly, you are saying that higher order logic cannot prove all theorems about the integers that ZFC can. That’s a very uninteresting statement. Since higher order logic was proven consistent* with ZFC, it is strictly weaker. Second order logic, is, of course, strictly weaker than 4th order logic, which is strictly weaker than 6th order logic, and so on, and is thus much weaker than higher-order logic.
I’ve never heard it claimed that second-order logic has a unique model of the standard integers. Actually, I’ve never heard of the standard integers—do you mean a unique standard model of the integers? I hope not, as we actually can get a unique standard model of the integers in a system with machine-checkable proofs. See “To Truth Through Proof” by Peter Andrews. It’s omitted in the Google Books preview, but I believe the proof is on page 327; i lent out my copy and cannot check. (It, naturally, does not have a unique general model, so this not of huge practical significance.)
* I have not actually read that, and should not really say that. However, type theory with an axiom of infinity, which is strictly stronger than type theory without an axiom of infinity, which can express all statements of higher-order logic, has been proven consistent. Also, any proof in higher order logic can be trivially converted into a proof of nth-order logic for some n, which can be shown consistent in (n+2)th-order logic.
If I understand correctly, you are saying that higher order logic cannot prove all theorems about the integers that ZFC can.
The argument wasn’t specific to ZFC or HOL, it was intended to apply to any system that can have a proof checker.
Actually, I’ve never heard of the standard integers—do you mean a unique standard model of the integers?
Sorry, I screwed up the wording there, meant to say simply “unique model of the integers”. People often talk about “standard integers” and “nonstandard integers” within models of a first-order axiom system, like PA or ZFC, hence my screwup. “Standard model” seems to be an unrelated concept.
I’ve never heard it claimed that second-order logic has a unique model of the standard integers.
The original post says something like that, search for “we know that we have only one (full) model”.
The argument wasn’t specific to ZFC or HOL, it was intended to apply to any system that can have a proof checker.
Pointing out the Gödelian limitations of all systems with recursively enumerable axioms doesn’t seem like much of criticism of the system of nth-order logic I mentioned. Now I have less of an idea of what you’re trying to say.
By the way, I think he’s using “full model” to mean “standard model.” Presumably, the standard integers are the standard model that satisfies the Peano axioms, while nonstandard integers are any other satisfying model (see http://en.wikipedia.org/wiki/Non-standard_model_of_arithmetic ).
Now I have less of an idea of what you’re trying to say.
Not much, admittedly :-) The interesting question to me is how to make a computer understand “integers” the way we seem to understand them. It must be possible because humans are not magical, but second-order logic doesn’t seem to help much.
If the human and the computer are using the same formal system, then sure. But how does a human arrive at the belief that some formal system, e.g. PA, is “really” talking about the integers, and some other system, e.g. PA+¬Con(PA), isn’t? Can we formalize the reasoning that leads us to such conclusions? Pointing out that PA can be embedded in some larger formal system, e.g. ZFC, doesn’t seem to be a good answer because that larger formal system will need to be justified in turn, leading to an infinite regress. And anyway humans don’t seem to be born with a belief in ZFC, so something else must be going on.
We form beliefs about mathematics the same way we form beliefs about everything else: heuristic-based learning algorithms. We typically accept things based on intuition and inductive inference until trained to rely on proof instead. There is nothing stopping a computer from forming mathematical beliefs based on statistical inference rather than logical inference.
Have a look at experimental mathematics or probabilistic number theory for some related material.
Saying “heuristic-based learning algorithms” doesn’t seem to compress probability mass very much. It feels like skipping over the mysterious part. How exactly do we write a program that would arrive at the axioms of PA by using statistics? I think if we did that, then the program probably wouldn’t stop at PA and would come up with many other interesting axioms, so it could be a useful breakthrough.
Yes, but the point is that we are learning features from empirical observations, not using some magic deduction system that our computers don’t have access to. That may only be one bit of information, but it’s a very important bit. This skips over the mysterious part in the exact same way that “electrical engineering” doesn’t answer “How does a CPU work?”—it tells you where to look to learn more.
I know far less about empirical mathematics than about logic. The only thing along these lines I’m familiar with is Douglas Lenat’s Automated Mathematician (which is only semi-automated). A quick search for “automated mathematician” on Google Scholar gives a lot of more recent work, including a 2002 book called “Automated theory formation in pure mathematics.”
But how does a human arrive at the belief that some formal system, e.g. PA, is “really” talking about the integers, and some other system, e.g. PA+¬Con(PA), isn’t? Can we formalize the reasoning that leads us to such conclusions?
Well, human beings are abductive and computational reasoners anyway. I think our mental representation of the natural numbers is much closer to being the domain-theoretic definition of the natural numbers as the least fixed point of a finite set of constructors.
Note how least fixed-points and abductive, computational reasoning fit together: a sensible complexity prior over computational models is going to assign the most probability mass to the simplest model, which is going to be the least-fixed-point model, which is the standard model (because nonstandard naturals will require additional bits of information to specify their constructors).
A similar procedure, but with a coinductive (greatest-fixed-point) definition that involves naturals as parameters to the data constructors, will give you the real numbers.
Proofs in second-order logic work the exact same way as proofs in first-order logic: You prove a sentence by beginning with axioms, and construct the sentence using inference rules. In first-order logic, you have individuals, functions (from individuals to individuals), terms, and predicates (loosely speaking, functions from individuals to truth values), but may only quantify over individuals. In second-order logic, you may also quantify over predicates and functions. In third-order logic, you add functions of functions of individuals, and, in fourth-order logic, the ability to quantify over them. This continues all the way up to omega-eth order logic (a.k.a.: higher-order logic), where you may use functions of arbitrary orders. The axioms and inference rules are the same as in first-order logic.
Wait, I said nothing about sets above. Sets are no problem: a set containing certain elements is equivalent to a predicate which only returns true on those elements.
I also lied about it being the same as first-order logic—a couple are added. One very useful axiom scheme is comprehension, which roughly means that you can find a predicate equivalent to any formula. You can think of this as being an axiom schema of first-order logic, except that, since it cannot be stated until fourth-order logic, it contains no axioms in first-order logic.
(My formal logic training is primarily in Church’s theory of types, which is very similar to higher-order logic [though superficially extremely different]. I probably mixed-up terminology somewhere in the above.)
The comprehension axiom schema (or any other construction that can be used by a proof checker algorithm) isn’t enough to prove all the statements people consider to be inescapable consequences of second-order logic. You can view the system you described as a many-sorted first-order theory with sets as one of the sorts, and notice that it cannot prove its own consistency (which can be rephrased as a statement about the integers, or about a certain Turing machine not halting) for the usual Goedelian reasons. But we humans can imagine that the integers exist as “hunks of Platoplasm” somewhere within math, so the consistency statement feels obviously true to us.
It’s hard to say whether our intuitions are justified. But one thing we do know, provably, irrevocably and forever, is that being able to implement a proof checker algorithm precludes you from ever getting the claimed benefits of second-order logic, like having a unique model of the standard integers. Anyone who claims to transcend the reach of first-order logic in a way that’s relevant to AI is either deluding themselves, or has made a big breakthrough that I’d love to know about.
Indeed, since the second-order theory of the real numbers is categorical, and since it can express the continuum hypothesis, an oracle for second-order validity would tell us either that CH or ¬CH is ‘valid’.
(“Set theory in sheep’s clothing”.)
If I understand correctly, you are saying that higher order logic cannot prove all theorems about the integers that ZFC can. That’s a very uninteresting statement. Since higher order logic was proven consistent* with ZFC, it is strictly weaker. Second order logic, is, of course, strictly weaker than 4th order logic, which is strictly weaker than 6th order logic, and so on, and is thus much weaker than higher-order logic.
I’ve never heard it claimed that second-order logic has a unique model of the standard integers. Actually, I’ve never heard of the standard integers—do you mean a unique standard model of the integers? I hope not, as we actually can get a unique standard model of the integers in a system with machine-checkable proofs. See “To Truth Through Proof” by Peter Andrews. It’s omitted in the Google Books preview, but I believe the proof is on page 327; i lent out my copy and cannot check. (It, naturally, does not have a unique general model, so this not of huge practical significance.)
* I have not actually read that, and should not really say that. However, type theory with an axiom of infinity, which is strictly stronger than type theory without an axiom of infinity, which can express all statements of higher-order logic, has been proven consistent. Also, any proof in higher order logic can be trivially converted into a proof of nth-order logic for some n, which can be shown consistent in (n+2)th-order logic.
We don’t seem to understand each other yet :-(
The argument wasn’t specific to ZFC or HOL, it was intended to apply to any system that can have a proof checker.
Sorry, I screwed up the wording there, meant to say simply “unique model of the integers”. People often talk about “standard integers” and “nonstandard integers” within models of a first-order axiom system, like PA or ZFC, hence my screwup. “Standard model” seems to be an unrelated concept.
The original post says something like that, search for “we know that we have only one (full) model”.
Pointing out the Gödelian limitations of all systems with recursively enumerable axioms doesn’t seem like much of criticism of the system of nth-order logic I mentioned. Now I have less of an idea of what you’re trying to say.
By the way, I think he’s using “full model” to mean “standard model.” Presumably, the standard integers are the standard model that satisfies the Peano axioms, while nonstandard integers are any other satisfying model (see http://en.wikipedia.org/wiki/Non-standard_model_of_arithmetic ).
Not much, admittedly :-) The interesting question to me is how to make a computer understand “integers” the way we seem to understand them. It must be possible because humans are not magical, but second-order logic doesn’t seem to help much.
Computers can prove everything about integers that we can. I don’t see a problem here.
If the human and the computer are using the same formal system, then sure. But how does a human arrive at the belief that some formal system, e.g. PA, is “really” talking about the integers, and some other system, e.g. PA+¬Con(PA), isn’t? Can we formalize the reasoning that leads us to such conclusions? Pointing out that PA can be embedded in some larger formal system, e.g. ZFC, doesn’t seem to be a good answer because that larger formal system will need to be justified in turn, leading to an infinite regress. And anyway humans don’t seem to be born with a belief in ZFC, so something else must be going on.
We form beliefs about mathematics the same way we form beliefs about everything else: heuristic-based learning algorithms. We typically accept things based on intuition and inductive inference until trained to rely on proof instead. There is nothing stopping a computer from forming mathematical beliefs based on statistical inference rather than logical inference.
Have a look at experimental mathematics or probabilistic number theory for some related material.
Saying “heuristic-based learning algorithms” doesn’t seem to compress probability mass very much. It feels like skipping over the mysterious part. How exactly do we write a program that would arrive at the axioms of PA by using statistics? I think if we did that, then the program probably wouldn’t stop at PA and would come up with many other interesting axioms, so it could be a useful breakthrough.
Yes, but the point is that we are learning features from empirical observations, not using some magic deduction system that our computers don’t have access to. That may only be one bit of information, but it’s a very important bit. This skips over the mysterious part in the exact same way that “electrical engineering” doesn’t answer “How does a CPU work?”—it tells you where to look to learn more.
I know far less about empirical mathematics than about logic. The only thing along these lines I’m familiar with is Douglas Lenat’s Automated Mathematician (which is only semi-automated). A quick search for “automated mathematician” on Google Scholar gives a lot of more recent work, including a 2002 book called “Automated theory formation in pure mathematics.”
Well, human beings are abductive and computational reasoners anyway. I think our mental representation of the natural numbers is much closer to being the domain-theoretic definition of the natural numbers as the least fixed point of a finite set of constructors.
Note how least fixed-points and abductive, computational reasoning fit together: a sensible complexity prior over computational models is going to assign the most probability mass to the simplest model, which is going to be the least-fixed-point model, which is the standard model (because nonstandard naturals will require additional bits of information to specify their constructors).
A similar procedure, but with a coinductive (greatest-fixed-point) definition that involves naturals as parameters to the data constructors, will give you the real numbers.