Why you can’t treat decidability and complexity as a constant (Post #1)
Or: why you need to fix a machine model, and the constraints on that machine, before you can prove anything about it. This matters just as much for problems claimed to be decidable or undecidable by algorithms.
The basic reason is that different machine/computational models define different sets of computable functions, so you cannot treat all machines as equivalent in power.
Most people today get this implicitly, but it can still cause problems, and it caused real problems for Church, Turing, and their contemporaries: they incorrectly jumped to the conclusion that a Turing Machine could simulate any possible computer, or carry out any possible algorithm, probably because they assumed you could treat a Turing Machine as equivalent to any other machine. Why the Church-Turing thesis is false, in the sense that it doesn't apply universally, will be covered in an upcoming post in this sequence; for now, take it as given that different models of computation define different computable sets, or equivalently, that different problems become decidable when new models are added.
Edit: For the next post I'm focusing on two variants of the thesis: one that asks not what's possible given the mathematical structure of our reality, but whether it's possible to exceed the Turing Machine at all in any possible mathematical structure; and another that restricts attention to Turing-equivalent models while allowing us to arbitrarily change what we offer the Turing Machine.
This matters because I'm going in a philosophical rather than practical direction, and the thesis itself made no reference to our specific mathematical reality at all.
From Wikipedia:
“In computability theory, the Church–Turing thesis (also known as computability thesis,[1] the Turing–Church thesis,[2] the Church–Turing conjecture, Church’s thesis, Church’s conjecture, and Turing’s thesis) is a thesis about the nature of computable functions. It states that a function on the natural numbers can be calculated by an effective method if and only if it is computable by a Turing machine. ”
Link is below:
https://en.wikipedia.org/wiki/Church–Turing_thesis
The most important takeaway from this sequence is that it really does matter which computer/machine model you use and which constraints you add to it. Turing Machines aren't always the best model, and you shouldn't elide the choice of model in a proof as an informal statement, because different choices map to different computable sets, which is not something you want left ambiguous in a proof.
The best example of this phenomenon is the following:
The difference between a TM with a Deutschian CTC and a TM without one, in terms of which problems they can solve (not what the machines can simulate!)
Scott Aaronson, Mohammad Bavarian, and Giulio Gueltrini proved (in the paper linked below) that the set of problems a Turing Machine with unbounded time and memory can solve with a Deutschian CTC is strictly larger than the set of problems the same machine can solve without one.
Specifically, the set of problems solvable with a Deutschian CTC includes the halting problem itself, as well as every problem that is Turing-reducible to the halting problem.
It’s a classic halting oracle, but now it’s constructed and visualized.
The link is below; the main machinery is essentially a new lemma plus some theorems establishing that the class of problems solvable by Turing Machines with CTCs both contains, and is contained in, the class of problems decidable with a halting oracle.
https://www.scottaaronson.com/papers/ctchalt.pdf
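To make the flavour of the result concrete, here is a minimal sketch (Python, purely illustrative). The "halts" oracle below is hypothetical; it is exactly the thing an ordinary Turing Machine cannot implement. Everything else just shows how problems Turing-reducible to the halting problem become answerable once you have such an oracle.

```python
# Purely illustrative sketch. `halts` is a HYPOTHETICAL oracle of the kind a
# Deutschian-CTC Turing Machine would provide in the Aaronson-Bavarian-Gueltrini
# setting; it is not implementable on an ordinary Turing Machine.

def halts(program_source: str, program_input: str) -> bool:
    """Hypothetical oracle: True iff the program halts on the given input."""
    raise NotImplementedError("not computable by an ordinary Turing Machine")


def runs_forever(program_source: str, program_input: str) -> bool:
    """The complement of the halting problem: not even semi-decidable for an
    ordinary TM, but a one-query Turing reduction once the oracle exists."""
    return not halts(program_source, program_input)


# A Pi_1 arithmetic statement, e.g. the Goldbach conjecture, can likewise be
# settled with a single query: write a searcher that halts iff it finds a
# counterexample, then ask whether the searcher runs forever.
GOLDBACH_SEARCHER = """
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

n = 4
while any(is_prime(p) and is_prime(n - p) for p in range(2, n)):
    n += 2
# reaching this line means an even number that is not a sum of two primes
# was found, so the program halts
"""

def goldbach_conjecture_holds() -> bool:
    return runs_forever(GOLDBACH_SEARCHER, "")
```

The paper's actual construction is of course quite different (it works through the causal-consistency requirement of the CTC); the sketch only illustrates what "Turing-reducible to the halting problem" buys you.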
The importance of the result
The importance of this result is that you can't arbitrarily change a Turing Machine and still get the same set of computable functions. It motivates why it's important to distinguish different intuitive/formal machine models, and why you need to fix the machine model, as well as the constraints on that model, before you can talk about which problems are decidable or intractable. Otherwise you end up in the trap of Church, Turing, and their contemporaries, who thought a Turing Machine could capture all of computation.
In the next post, I'll talk about Hilbert's triumph, Church and Turing's failure, and why it turns out there are machines that can decide first-order logic, which is a huge chunk of math.
An aside on CTC Turing Machines, AIXI, Solomonoff Induction, and related models
While I will probably write a post in this sequence comparing what different computer models can do, it's worth noting that AIXI and Solomonoff inductors can be weaker or stronger than CTC Turing Machines, depending on whether you allow them to solve only the problems that are many-one reducible to the halting problem, all problems that are Turing-reducible to the halting problem, or problems even higher in the arithmetical hierarchy. I include this because, depending on how much power you assume AIXI or Solomonoff inductors have, you can get very different results.
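For reference, the standard recursion-theoretic facts behind that comparison (nothing here is original; it is the usual arithmetical-hierarchy picture) can be summarized as:

```latex
% Standard facts from recursion theory (Post's theorem / the limit lemma):
\begin{align*}
\{\,A : A \le_m \mathrm{HALT}\,\} &= \Sigma_1 \quad \text{(the computably enumerable sets)}\\
\{\,A : A \le_T \mathrm{HALT}\,\} &= \Delta_2\\
\Sigma_1 \;\subsetneq\; \Delta_2 \;\subsetneq\; \Sigma_2 \;\subsetneq\; \Delta_3 \;\subsetneq\; \cdots
\end{align*}
```

So a model pegged at "many-one reducible to the halting problem" sits strictly below one pegged at "Turing-reducible to the halting problem", and both sit strictly below models allowed to climb further up the hierarchy.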
Major edit on the Church-Turing Thesis
For the purposes of this post, I will state the Church-Turing Thesis like this, translating what I think Church and Turing wanted to conjecture.
The Church-Turing Thesis/Conjecture is captured by the first point below:
1. The Universal Turing Machine, or equivalently a human with unlimited time and memory, can compute all the functions that are calculable by effective functions, as defined below.
2. Effectively calculable functions have the following constraints/rules:
2.1 They must always give an answer in finite time, though the time bound can be arbitrarily large; likewise, the space required to solve a problem may be unbounded across instances, but it is never infinite, only arbitrarily large.
2.2 The human is assumed to have unlimited paper to act as memory, and unlimited time to compute (they are immortal).
2.3 The original condition has been replaced by the following condition from Rafael Harth:
In this setting, we call a problem decidable iff there exists a single, fixed procedure that can take arbitrary instances of this problem and always behaves like this:
If the correct answer is ‘yes’, it outputs ‘yes’. If the correct answer is ‘no’, it outputs ‘no’.
The original statement is included below for completeness:
Its instructions need only to be followed rigorously to succeed. In other words, it requires no ingenuity to succeed.[4]
2.4 This condition is quoted below:
It consists of a finite number of exact, finite instructions.
2.5 Arbitrary extensions to the UTM are valid as long as they obey the constraints above.
2.6 Alternatively, any algorithm or effective method as defined in this post may be used so long as it is simulatable by a UTM; even multiple UTMs may be used to carry out the simulation.
Edit: 2.5 is essentially the classical Church-Turing thesis, while 2.6 is not, properly speaking, the Church-Turing thesis at all; it's a useful modification.
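For reference, the classical thesis that 2.5 is meant to capture can be written as an equality of two classes of functions (my notation, not Church's or Turing's wording):

```latex
\[
\{\, f : \mathbb{N} \to \mathbb{N} \mid f \text{ is effectively calculable} \,\}
\;=\;
\{\, f : \mathbb{N} \to \mathbb{N} \mid f \text{ is computable by a Turing Machine} \,\}
\]
```

The claim I will argue for is that, under conditions 2.5 or 2.6, the left-hand class becomes a strict superset of the right-hand one.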
The question becomes: do the constraints on the human or UTM always lead to the same set of computable functions, or can they lead to a larger one? We will prove later on that they can lead to a larger set.
Specifically, we will prove in a later post that conditions 2.5 or 2.6 lead to a larger set of computable functions than a UTM can compute, and thus that the Church-Turing Thesis as stated is false. However, by dropping both conditions and adding a new condition by hand that restricts what we can do with effective functions, it can be made true.
Something that might help in understanding decidability
This snippet from Rafael Harth is generally useful and doesn't require much in the way of time investment.
When mathematicians talk about a ‘problem’ in this context, they always mean “an infinite set of problems such that (a) the answer to each problem is either ‘yes’ or ‘no’, and (b) the set contains arbitrarily large instances of problems”. Here are two examples:
Given an arbitrarily large natural number, decide if it is a prime number.
Given an arbitrarily long program code, decide if the program, if run, eventually halts.
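For the first example, here is a minimal sketch (Python, for concreteness) of what such a "single, fixed procedure" looks like: one procedure, arbitrarily large instances, always halts, always answers correctly. No analogous procedure exists for the second example; that is exactly what it means for the halting problem to be undecidable on an ordinary Turing Machine.

```python
# A decider in the sense above: one fixed procedure that takes arbitrary
# instances and always halts with the correct yes/no answer.

def is_prime(n: int) -> bool:
    """Decides 'is n a prime number?' for an arbitrarily large natural n.
    Trial division is slow, but the definition only requires that the
    answer always arrive in finite (not bounded) time."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# e.g. is_prime(2**31 - 1) returns True (a Mersenne prime), eventually.
```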
The Church-Turing thesis is about what things can be effectively computed in reality. Obviously you can define hypothetical machines that exceed the Turing machine, and this was already known to Church and Turing. CTC computers that can solve the halting problem don’t disprove the CTT unless they can actually be built.
Note that the original thesis made no reference to our specific reality at all; it was instead defined in terms of effective functions.
Thus I can make essentially arbitrary changes to the model, and I can still break the thesis if there's any extension of a Turing Machine that lets you compute more things.
Right now, I’m only trying to determine possibility, in a philosophical sense, not something that I know could be done.
If I recall correctly, “effective” refers to a method which can exist in our reality. I'd be surprised if you could find a quote from the original papers suggesting otherwise.
I went to the Wikipedia page, where the notion of an effective method is defined. Critically, nothing about the specific constraints of physical reality is ever stated in that definition, and neither are any constraints on the human carrying it out.
Here’s the article where I got it from.
https://en.wikipedia.org/wiki/Effective_method
Before I reply to this, I'd like to note that if your point is that the limits of effective functions depend on what kind of universe the functions are executed in, then I claim that people on LW accept that thesis already. But we don't accept that the extended Church-Turing thesis is wrong. A plurality of LW users would say it is probably right. And what we mean by that is that any process we care about in our world can be simulated accurately enough for our purposes by a recursively computable function.
As far as I can tell, this is what most people mean by the extended Church-Turing thesis. Heck, that's what Turing meant by it. Or, at least, he didn't mean what I think you claim he meant. He was the first to investigate machines which could compute the undecidable. He called them O-machines in his thesis. And if Wikipedia gives a different definition, then so much the worse for Wikipedia!
I don't think we actually disagree here about the limits of computation. I think we agree that we can only compute recursive functions in our universe, and that in other universes this could easily be different. Rather, we just disagree over what things like “effective functions” mean, or what Church-Turing means, etc. And I suspect that's why other people were downvoting your post. They probably read your post and thought you didn't understand basic computability theory. But to me, it looks like you do understand computability theory but are simply using some terms incorrectly. The disagreement is purely semantic.
If you rewrote your post in light of this, I suspect it would fare much better here.
That was definitely most of the disagreement, as I remembered a different version of the Church-Turing Thesis than some other people had, and I had different ideas of what effective functions meant to people (I didn't anticipate people assuming constraints that weren't stated in the Wikipedia definition).
My fundamental claim is that the Church-Turing Thesis is not valid, to use a term from logic; that is, it doesn't hold in every model.
I'll probably have a post later on about what we can actually compute, but I mostly agree, with some caveats (as with anything pertaining to the long-term future), that the set of functions we can compute is smaller than the set of computable functions as traditionally defined.
But yes, I will rewrite this post, mostly to make it clear what I’m arguing against.
What he apparently didn't realize is that this makes the claim, as I stated it, invalid but still satisfiable, since there exist models where the Church-Turing Thesis is true. I think the biggest disagreement was over whether the Church-Turing Thesis is an existential claim about computers or a universal one, along with a disagreement about which constraints are assumed into the problem and an imprecision over which models are allowed or reasonable.
And the extended version again has a problem with the use of the word “reasonable” in its definition, which IMO is a pretty big problem when people use the word to motte-and-bailey away criticisms of the thesis.
Nevertheless, I will rewrite this post, and thanks for commenting.
I made a major rewrite of this post: I defined what Church and Turing were likely talking about, and in particular defined what effective methods can do and what constraints are placed on them, or alternatively what an algorithm must do and which constraints it must follow.
Critically, while algorithm running times and memory/space usage are always finite, they are not bounded.
Also, the thesis is universally quantified rather than existentially quantified, since it includes something close to a for-all condition.
It seems to me that the constraints of reality are implicit. I don’t think “it can be done by a human” is satisfied by a method requiring time travel with a very specific form of paradox resolution. It sounds like you’re arguing that the Church-Turing thesis is simply worded ambiguously.
I definitely remember a different Church-Turing Thesis than the one you state, and if we kept to the realistic limitations of humans, the set of computable functions would be way, way smaller than on the traditional view, so that's not really much of a win at all. At any rate, the method is usable by a human, mostly because humans are more specific, and critically a Universal Turing Machine can perfectly emulate a human brain; so if we accept the CTC model as valid for Turing Machines, then we need to accept its validity for humans, since humans are probably just yet another form of Turing Machine.
I will edit the post though, to make it clear what I’m talking about.
I think you are overclaiming here. I’m not a philosopher of computation, but it seems fair to say that the Church-Turing thesis refers to the specific model gestured at by the simplified model of reality where you allow unbounded space and time usage, and no edge cases of physics or quantum phenomena etc. Similarly, the natural numbers as such don’t depend on any mathematical model, but they are what you get if you “just keep counting”.
There's a new version of my response, which I post here for completeness, but I'll respond to the old one below:
New quote:
The problem is that, given the assumption and my conditions, it is easy to show under condition 2.6 that I can get a larger set of functions computable by a human in a simulated UTM universe than the original UTM could compute, and that there must exist a computation, performed by a human with unlimited time and memory inside the UTM's simulation, that is not computable according to the original definition.
The process is best explained by the website below, but it basically uses a trick whereby an uncountable infinity of branches can be generated from only a countable infinity of vertices and edges (the Turing Machines).
http://www.amirrorclear.net/academic/ideas/simulation/index.html
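The counting fact behind that trick is standard, even if the use it's put to here is the contentious part: the full infinite binary tree has only countably many vertices and edges, yet uncountably many infinite branches.

```latex
\[
|V| \;=\; \sum_{n=0}^{\infty} 2^{n} \;=\; \aleph_0,
\qquad
|\{\text{infinite branches}\}| \;=\; |\{0,1\}^{\mathbb{N}}| \;=\; 2^{\aleph_0} \;>\; \aleph_0 .
\]
```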
The problem is that unbounded time and memory start to let you simulate things you should have no business simulating, so if you want to avoid the consequences, you either need to constrain the Universal Turing Machine's time or memory, or put in a condition by hand stating that you can't simulate non-computable functions.
Another problem is that excluding the edge cases is not a reasonable thing to ask for when defending a universal claim. That's the problem: edge cases matter for a universal claim, which is what I suspect Church and Turing were making, and they point to why the thesis doesn't work. As an existential claim it would work, but as a universal claim it fails.
It looks like Deutschian CTCs are similar to a computer that can produce all possible outputs in different realities, then selectively destroy the realities that don't solve the problem. It's not surprising that you could solve the halting problem in such a framework.
The way CTC computers actually work is that, assuming we can act on the fundamental degrees of freedom of the universe, getting the right answer to a hard problem is the only logically self-consistent outcome; getting the wrong answer, or no answer at all, is logically inconsistent and thus can't occur.
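A toy classical analogue of that self-consistency requirement (my own illustrative sketch, not the construction from the Aaronson-Bavarian-Gueltrini paper): model the evolution of the CTC register as a stochastic matrix and demand a probability distribution that is its own image under that evolution. Such a fixed point always exists for a stochastic matrix, which is the sense in which nature is "forced" to find a consistent answer.

```python
# Toy classical analogue of the Deutschian consistency condition: the
# distribution p over the CTC register must satisfy p = M @ p, where M is
# the column-stochastic matrix induced by the circuit.  We find a fixed
# point as an eigenvector of M with eigenvalue 1.
import numpy as np

def self_consistent_distribution(M: np.ndarray) -> np.ndarray:
    """Return a probability vector p with M @ p = p (columns of M sum to 1)."""
    eigvals, eigvecs = np.linalg.eig(M)
    k = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue closest to 1
    p = np.abs(np.real(eigvecs[:, k]))
    return p / p.sum()

# Example: a 2-state "answer bit" whose update keeps state 0 fixed and sends
# state 1 to state 0.  The only self-consistent distribution puts all its
# probability on state 0, the "consistent answer".
M = np.array([[1.0, 1.0],
              [0.0, 0.0]])
print(self_consistent_distribution(M))   # -> [1. 0.]
```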