One part I’m not clear on is how the empirical knowledge works. The equivalent of “kilograms of mass” might be something like bits of Chaitin’s omega. If you have n bits of Chaitin’s omega, you can solve the halting problem for any Turing machine of length up to n. But, while you can get lower bounds on Chaitin’s omega by running Turing machines and seeing which halt, you can’t actually learn upper bounds on Chaitin’s omega except by observing uncomputable processes (for example, a halting oracle confirming that some Turing machine doesn’t halt). So unless your empirical knowledge is coming from an uncomputable source, you shouldn’t expect to gain any more bits of Chaitin’s omega.
In general, if we could recursively enumerate all non-halting Turing machines, then we could decide whether a machine M halts by running M in parallel with a process that enumerates non-halting machines until it finds M. If M halts, we eventually observe it halting; if it doesn’t halt, it eventually appears in the enumeration. So such a recursive enumeration would give us an algorithm for the halting problem, which is impossible. I’m trying to understand how the things you’re saying could give us more powerful theories from empirical data without allowing us to recursively enumerate all non-halting Turing machines.
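The reduction can be sketched concretely. Everything below is a toy illustration, not a real decider: the “machines” are a made-up finite lookup table, and `enumerate_nonhalting` stands in for the hypothesized enumeration, which for real Turing machines cannot exist.

```python
# Toy illustration: IF we could computably enumerate all non-halting
# machines, we could decide halting by dovetailing that enumeration
# against direct simulation. The "machines" here are an invented table
# (name -> step at which it halts, None = never).
TOY_MACHINES = {"m1": 4, "m2": None, "m3": 7, "m4": None}

def enumerate_nonhalting():
    # Stand-in for the hypothesized enumeration; uncomputable in reality.
    for name, halt_step in TOY_MACHINES.items():
        if halt_step is None:
            yield name

def decide_halts(m):
    """Dovetail: one step of simulating m, one step of the enumeration."""
    enum = enumerate_nonhalting()
    step = 0
    while True:
        step += 1
        halt_step = TOY_MACHINES[m]            # "run m for one more step"
        if halt_step is not None and halt_step <= step:
            return True                        # m was seen to halt
        try:
            if next(enum) == m:                # m showed up among the non-halters
                return False
        except StopIteration:
            pass                               # toy enumeration exhausted
```

Exactly one branch fires for every machine, which is the point of the argument: the dovetailing turns a mere enumeration of non-halters into a full decision procedure.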
The equivalent of “kilograms of mass” might be something like bits of Chaitin’s omega. If you have n bits of Chaitin’s omega, you can solve the halting problem for any Turing machine of length up to n.
That is the domain of things which I am referring to, yes.
you can’t actually learn upper bounds on Chaitin’s omega except by observing uncomputable processes (for example, a halting oracle confirming that some Turing machine doesn’t halt).
This is where things get fuzzier. I think that by routing through logic and observing the empirical consequences of ordinal consistency axioms, we could learn to believe those axioms and thus eventually climb the ordinal hierarchy to gain knowledge of the halting behavior of increasingly many Turing machines. However, my original idea was much plainer: just do a kind of Solomonoff Induction over Turing machines encoding axiom sets.
I’m trying to understand how the things you’re saying could give us more powerful theories from empirical data without allowing us to recursively enumerate all non-halting Turing machines.
Even empirical data only allows us to enumerate a finite number of non-halting Turing machines; it cannot let us decide the halting behavior of Turing machines too large for our empirical information.
So, you can compress a list of observations about which Turing machines halt by starting with a uniform prior over Chaitin’s omega. This can lead to quite a lot of compression: the information of whether the first n Turing machines halt consists of n bits, but only requires log(n) bits of Chaitin’s omega. If we saw whether more Turing machines halted, we would also uncover more bits of Chaitin’s omega. Is this the kind of thing you are thinking of?
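The compression claim can be cashed out with the standard argument for why Ω’s bits decide halting. The sketch below is entirely a toy: a hypothetical prefix-free set of four “programs” with stipulated halting times, so that Ω can be computed exactly, which is of course impossible for a real universal machine.

```python
from fractions import Fraction

# Hypothetical prefix-free machine: program -> step at which it halts
# (None = never). Invented for illustration; the real Omega and real
# halting times are uncomputable.
TOY = {"0": 3, "10": None, "110": 5, "111": 2}

def omega_truncated(k):
    # Cheating: we can only compute Omega because we stipulated the machine.
    omega = sum(Fraction(1, 2 ** len(p)) for p, t in TOY.items() if t is not None)
    return omega - omega % Fraction(1, 2 ** k)   # first k bits of Omega

def halts(p, k):
    """Decide halting of program p, len(p) <= k, from k bits of Omega.

    Dovetail all programs until the measure of the halted ones reaches
    the k-bit truncation of Omega; any length-<=k program still running
    can never halt, since halting later would add at least 2^-k and push
    Omega past its truncation.
    """
    assert len(p) <= k
    bound, lower, halted, step = omega_truncated(k), Fraction(0), set(), 0
    while lower < bound:
        step += 1
        for q, t in TOY.items():
            if q not in halted and t is not None and t <= step:
                halted.add(q)
                lower += Fraction(1, 2 ** len(q))
    return p in halted
```

Note the asymmetry the thread keeps circling: the lower bound `lower` is computable by pure simulation; it is the truncation `bound` that would have to come from somewhere uncomputable.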
I guess there’s another question of how any of this makes sense if the universe is computable. We can still use information about which Turing machines halt in part of our generative model for a computable universe, even though “x doesn’t halt” is never actually observed.
Perhaps you could make a statement like: Solomonoff induction wins on computable universes for the usual reason, and it doesn’t lose too many bits on uncomputable universes in some circumstances because it does at least as well as something that has a uniform prior over Chaitin’s omega.
I guess there’s another question of how any of this makes sense if the universe is computable. We can still use information about which Turing machines halt in part of our generative model for a computable universe, even though “x doesn’t halt” is never actually observed.
Think more like an inductive human being than a deductive formal system: “x doesn’t ever halt” would require proof, but “x really seems not to halt for a very long time and seems to contain an infinitely-growing recursion there” gives us very valid reason to believe “x doesn’t halt”, and guides us in trying to assemble a proof.
(Likewise, as a side note, it can actually be quite difficult to prove “x does halt (on all inputs)”, because merely being observed to halt on many inputs is no guarantee at all. Unless the looping/recursion is trivially structural on the input, proving “x does halt” to a type system’s termination checker can be quite difficult!)
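The contrast is easy to see in a proof assistant. A minimal Lean 4 sketch (the function names are mine): structural recursion sails through the termination checker, while a function that halts on every input anyone has ever tried, but whose termination is an open problem, is rejected unless we opt out of the guarantee.

```lean
-- Structurally recursive: the argument shrinks at every call, so the
-- termination checker accepts this immediately.
def double : Nat → Nat
  | 0 => 0
  | n + 1 => double n + 2

-- The Collatz step count: observed to halt on every input ever tried,
-- yet nobody can prove it, so the checker rejects it unless we escape
-- with `partial` -- giving up the termination guarantee entirely.
partial def collatzSteps (n : Nat) : Nat :=
  if n ≤ 1 then 0
  else if n % 2 == 0 then collatzSteps (n / 2) + 1
  else collatzSteps (3 * n + 1) + 1
```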
Perhaps you could make a statement like: Solomonoff induction wins on computable universes for the usual reason, and it doesn’t lose too many bits on uncomputable universes in some circumstances because it does at least as well as something that has a uniform prior over Chaitin’s omega.
I think that considering “computable” or “uncomputable” to be ontological properties of universes is… contrary to the philosophical viewpoint being taken here? At least from my point of view here, you’re not breaking an Ontological Law of the Platonic Plane of Maths if you “decide the undecidable”; you’re breaking a law of thermodynamics (namely, you’re trying to act as if you possessed infinite entropy with which to compute). It’s physically impossible, not Platonically impossible.
If I may invoke a phrase, I’m basically taking the point of view that a formal system is a map of some mathematical territory (which Turing machines halt), but that it is possible to have increasingly detailed and accurate maps (systems with stronger axioms) of the same territory (increasing the quantity of Turing machines whose behavior we can decide). The hard nut to crack is: how do we acquire better maps?
Further, the universe doesn’t necessarily have to be computable in any ontological sense. To talk about the physical, empirical universe, we only need to make predictions about events which take place a finite period of time into the future. If we’re composing programs whose termination we want to guarantee, we can use inductive and coinductive type systems to guarantee termination and stepwise termination where we need them, and add hard time limits where we can’t be sure.
As Schmidhuber put it regarding Goedel Machines: there may be very rare situations in which some part of reality is just Platonically, ontologically incomputable, but a human being is bound by the same laws of entropy as anything else, so the Platonic, ontological level of things doesn’t really matter for purposes of doing real-world math or AI. (We find it convenient to pretend it does matter, but actually we only care about the theorems we can demonstrate to be true among real-world human beings (which includes Goedel Statements for Peano Arithmetic) rather than about Platonic Forms of Pure Randomness.)
Think more like an inductive human being than a deductive formal system: “x doesn’t ever halt” would require proof, but “x really seems not to halt for a very long time and seems to contain an infinitely-growing recursion there” gives us very valid reason to believe “x doesn’t halt”, and guides us in trying to assemble a proof.
Busy beavers seem not to halt for a very long time and spend most of their time in a seemingly infinitely-growing recursion, and then they halt.
You could argue that busy beavers are pathological examples of little practical interest, but then many programs that perform useful computation may run for a very long time without entering easily identifiable “loops” before they halt. Unless you designed the program yourself and have a good understanding of what it is doing, and sometimes even if you designed the program yourself, proving termination may not be trivial.
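The busy-beaver point is easy to check concretely with the known 4-state, 2-symbol champion: it shuttles back and forth for over a hundred steps with no obvious loop, and then halts. The transition table is the standard published champion; the simulator is a minimal sketch.

```python
# The 4-state, 2-symbol busy beaver champion: runs 107 steps, writes 13
# ones, then halts -- after looking for all the world like it never will.
# Entries: (state, read) -> (write, head_move, next_state); "H" = halt.
BB4 = {
    ("A", 0): (1, +1, "B"), ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"), ("B", 1): (0, -1, "C"),
    ("C", 0): (1, +1, "H"), ("C", 1): (1, -1, "D"),
    ("D", 0): (1, +1, "D"), ("D", 1): (0, +1, "A"),
}

def run(table, max_steps):
    tape, head, state, steps = {}, 0, "A", 0
    while state != "H" and steps < max_steps:
        write, move, state = table[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        steps += 1
    return state == "H", steps, sum(tape.values())

halted, steps, ones = run(BB4, 10_000)
```

Watching the trace, there is no point before the final step at which “seems stuck in a growing recursion” distinguishes this machine from a genuine non-halter, which is exactly the objection.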
Unless you designed the program yourself and have a good understanding of what it is doing, and sometimes even if you designed the program yourself, proving termination may not be trivial.
If it was a trivial problem it wouldn’t be worth tackling.
My point is that the observation that a program has been running for a long time and doesn’t have simple “while(true) {...}” loops doesn’t give us much information about its termination.
But I’m not sure what you are aiming at. For most practical purposes, a program that takes a very long time before producing a result is as good as a program that will never produce a result. Termination is important in theoretical computer science as the prototypical undecidable property used to prove other non-existence results, but if you are concerned about specific programs that you plan to actually run on physical machines, why are you interested in termination?
This is where things get fuzzier. I think that by routing through logic and observing the empirical consequences of ordinal consistency axioms
I’m not sure I understand what you mean here.
Assume we have a formal system and we are using it to predict observations. In classical logic, if the formal system is inconsistent, then ex falso quodlibet, therefore all possible observations will be allowed by it.
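The explosion step itself is a two-line theorem. A Lean 4 rendering (the theorem name is mine):

```lean
-- Ex falso quodlibet: a single contradiction proves every proposition,
-- so an inconsistent theory rules nothing out -- it "predicts" anything.
theorem explosion (P Q : Prop) (hp : P) (hnp : ¬P) : Q :=
  absurd hp hnp
```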
I think the grandparent silently assumes we’ve solved logical uncertainty, and can use this to decide an axiom does not produce a contradiction by observing the lack of explosion. Though “routing through logic” seems like an incredibly bad way to say that.
I think the grandparent silently assumes we’ve solved logical uncertainty, and can use this to decide an axiom does not produce a contradiction by observing the lack of explosion.
Wouldn’t that imply having solved undecidability?

I spoke imprecisely, or at least the OP argues that I did. It assumes we have a progressive solution to undecidability/logical uncertainty that can take us pretty far, eventually, but does not allow self-knowledge at any time.
I think the grandparent silently assumes we’ve solved logical uncertainty, and can use this to decide an axiom does not produce a contradiction by observing the lack of explosion.
No. We don’t decide an axiom does not produce a contradiction by observing the lack of explosion. We just raise the probability of noncontradiction by observing the lack of explosion.
EDIT: Axiom, not acronym. Holy God I need more sleep.
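One way to cash this out is a toy Bayesian model, which is entirely my own framing, not anything from the post: suppose that if the theory were inconsistent, each round of proof search would have some fixed chance q of stumbling on the explosion. Observing no explosion then shifts probability toward consistency without ever reaching certainty.

```python
def posterior_consistent(prior, q, rounds):
    """P(consistent | no explosion seen in `rounds` rounds of proof search).

    Toy likelihood model: a consistent theory never explodes; an
    inconsistent one survives each round only with probability (1 - q).
    """
    survive_if_inconsistent = (1 - q) ** rounds
    numerator = prior * 1.0
    return numerator / (numerator + (1 - prior) * survive_if_inconsistent)

for t in (0, 10, 100, 1000):
    print(t, round(posterior_consistent(0.5, 0.01, t), 4))
```

The posterior climbs toward 1 but never reaches it, matching the claim: the probability of noncontradiction is raised, never decided.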
Though “routing through logic” seems like an incredibly bad way to say that.
-_-. The post itself was a sketch, not a finished, formalized publication. Please refrain from being anal about phrasings or imagining that I’ve done far more math than I really have. If I want to be precise, I’ll just use symbols—that’s what they’re for.