Why you can’t treat decidability and complexity as a constant (Post #1)

Or: why you need to fix a machine model, and the constraints on that model, before you can prove anything, in particular before claiming that a problem is decidable or undecidable by an algorithm.
The underlying reason is that different machine/computational models define different sets of computable functions, so you cannot treat all machines as equivalent in power.
Most people grasp this implicitly today, but it can still cause problems, and it caused real problems for Church, Turing, et al.: they incorrectly jumped to the conclusion that a Turing Machine could simulate any possible computer and compute any possible algorithm, probably because they assumed a Turing Machine could be treated as equivalent to any other machine. Why the Church-Turing thesis is false, in the sense that it does not apply universally, will be covered in an upcoming post in this sequence; for now, take it as given that different models of computation define different computable sets, or equivalently, that different problems become decidable when new models are admitted.
Edit: In the next post I’m focusing on one variant of the thesis that asks not what is possible given the mathematical structure of our reality, but whether it is possible to exceed the Turing Machine at all in any possible mathematical structure, and on another variant where we restrict ourselves to Turing-equivalent models but may arbitrarily change what we offer the Turing Machine.
This matters because I’m going in a philosophical rather than a practical direction, and the thesis itself made no reference to our specific mathematical reality at all.
From Wikipedia (https://en.wikipedia.org/wiki/Church–Turing_thesis):
“In computability theory, the Church–Turing thesis (also known as computability thesis,[1] the Turing–Church thesis,[2] the Church–Turing conjecture, Church’s thesis, Church’s conjecture, and Turing’s thesis) is a thesis about the nature of computable functions. It states that a function on the natural numbers can be calculated by an effective method if and only if it is computable by a Turing machine. ”
The most important takeaway from this sequence is that it really does matter which computer/machine model you use and which constraints you add to that model. Turing Machines aren’t always the best model to use, and you shouldn’t elide the choice of model in a proof as an informal statement, because different choices map to different computable sets, which is not something you want happening silently inside a proof.
The best example of this phenomenon is the following:

The difference between a TM with a Deutschian CTC and a TM without one, in terms of which problems they can solve (not which problems the machines can simulate!).
Scott Aaronson, Mohammad Bavarian, and Giulio Gueltrini proved, in the paper linked below, that the set of problems a Turing Machine with unbounded time and memory can solve with a Deutschian CTC is strictly larger than the set of problems a Turing Machine with unbounded time and memory can solve without one.
Specifically, the set of problems solvable with a Deutschian CTC includes the halting problem itself, as well as every problem that is Turing-reducible to the halting problem.
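To make “Turing-reducible to the halting problem” concrete, here is a minimal sketch, not taken from the paper: the oracle name `halts` and the choice of example problem (integer roots of polynomials, i.e. Hilbert’s tenth problem) are my own illustration of how one oracle query can settle a problem no ordinary TM can decide.

```python
# A minimal sketch (not from the paper) of deciding a HALT-reducible problem
# with a hypothetical halting oracle. `halts` is the primitive the CTC-augmented
# machine is assumed to provide; it is provably NOT computable by an ordinary TM.

from itertools import count, product
from typing import Callable

def halts(program: Callable[[], None]) -> bool:
    """Hypothetical oracle: True iff program() would eventually return."""
    raise NotImplementedError("oracle primitive, not implementable on a UTM")

def has_integer_root(poly: Callable[..., int], num_vars: int) -> bool:
    """Decide whether poly(x1, ..., xk) = 0 has an integer solution.
    Hilbert's tenth problem is undecidable for ordinary TMs, but it is
    many-one (hence Turing) reducible to HALT, so one oracle query settles it."""

    def search_until_root_found() -> None:
        # Enumerate integer tuples in ever-growing boxes; return (halt) iff a root exists.
        for bound in count(0):
            for xs in product(range(-bound, bound + 1), repeat=num_vars):
                if poly(*xs) == 0:
                    return
        # unreachable: the loop above runs forever when no root exists

    # The searcher halts if and only if a root exists, so one query decides it.
    return halts(search_until_root_found)
```

If the oracle existed, `has_integer_root(lambda x, y: x*x + y*y - 25, 2)` would return True; the only point here is that the searcher halts exactly when a root exists, so a single `halts` query decides an otherwise undecidable problem.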
The CTC machine is, in effect, a classic halting oracle, but now explicitly constructed and visualized.
The link is below; the main machinery is essentially a new lemma plus some theorems establishing that the class of problems solvable by Turing Machines with CTCs both contains the halting oracle and is contained in what the halting oracle can decide.

https://www.scottaaronson.com/papers/ctchalt.pdf
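Restating the two containments symbolically, as I understand the paper’s main result (the notation is mine):

```latex
% Languages decidable by a Turing Machine equipped with a Deutschian CTC,
% as I understand the paper's main theorem (notation is mine):
\[
\mathcal{L}_{\mathrm{CTC}} \;=\; \{\, L \mid L \le_T \mathrm{HALT} \,\} \;=\; \Delta^0_2
\]
% "Contains the halting oracle": HALT itself lies in \mathcal{L}_{\mathrm{CTC}}.
% "Is contained in the halting oracle": every language in \mathcal{L}_{\mathrm{CTC}}
% is Turing-reducible to HALT.
```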
The importance of the result

The importance of this result is that you can’t arbitrarily change a Turing Machine and still define the same set of computable functions. It motivates why it’s important to distinguish between different intuitive/formal machine models, and why you need to fix the machine model, as well as the constraints on that model, before you can talk about which problems are decidable or intractable. Otherwise you fall into the trap of Church and Turing et al., who thought all of computation could be captured by a Turing Machine.
In the next post, I’ll talk about Hilbert’s triumph, Church and Turing’s failure, and why it turns out that there are machines that can decide first-order logic, which is a huge chunk of math.
An aside on CTC Turing Machines, AIXI, Solomonoff Induction, et al.
While I will probably make a post in this sequence comparing what different computational models can do, it is worth noting that AIXI and Solomonoff inductors can actually be weaker or stronger than CTC Turing Machines, depending on whether you allow them to solve only the problems that are many-one reducible to the halting problem, all the problems that are Turing-reducible to the halting problem, or problems higher up the arithmetical hierarchy. I include this because, depending on how much power you assume AIXI or Solomonoff inductors have, you can get very different results.
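For reference, since the distinction does real work in the paragraph above, here are the standard definitions of the two reducibility notions (stated from memory, not quoted from any source in this post):

```latex
% Many-one reducibility: a single computable translation of instances.
\[
A \le_m B \;\iff\; \exists\, f \text{ total computable such that } x \in A \Leftrightarrow f(x) \in B
\]
% Turing reducibility: decidability given unlimited oracle queries to B.
\[
A \le_T B \;\iff\; A \text{ is decidable by an oracle Turing Machine with an oracle for } B
\]
% Every many-one reduction is a Turing reduction, but not conversely: the
% complement of HALT is Turing-reducible to HALT (query once and negate),
% yet not many-one reducible to it, since HALT is computably enumerable and
% its complement is not.
```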
Major edit on the Church-Turing Thesis
For the purposes of discussing the Church-Turing Thesis, I will state the claim by Church and Turing as follows, translating what I think they wanted to conjecture.
The Church-Turing Thesis/Conjecture is captured by the first point below:
1. The Universal Turing Machine, or equivalently a human with unlimited time and memory, can compute all functions that are effectively calculable, as defined below.
2. Effectively calculable functions have the following constraints/rules:
2.1 They must always give an answer in finite time, though the time bound can be arbitrarily large; likewise, the space required to solve a problem may be unbounded, but it is never infinite, only arbitrarily large.
2.2 The human is assumed to have unlimited paper to act as memory, and unlimited time to compute (they are immortal).
2.3 The original condition has been replaced by the following condition from Rafael Harth:
In this setting, we call a problem decidable iff there exists a single, fixed procedure that can take arbitrary instances of this problem and always behaves like this:
If the correct answer is ‘yes’, it outputs ‘yes’.
If the correct answer is ‘no’, it outputs ‘no’.
The original statement is included below for completeness:
Its instructions need only to be followed rigorously to succeed. In other words, it requires no ingenuity to succeed.[4]
2.4 This condition is quoted below:
It consists of a finite number of exact, finite instructions.
2.5 Arbitrary extensions to the UTM are valid as long as they obey the constraints above.
2.6 Alternatively, any algorithm or effective method as defined in this post is allowed to be used so long as it is simulatable by a UTM; even multiple UTMs are allowed to do the simulating.
Edit: 2.5 is essentially the classical Church-Turing thesis, while 2.6 is not, properly speaking, the Church-Turing thesis; it is instead a useful modification.
The question becomes: do the constraints on the human or UTM always lead to the same set of computable functions, or can they lead to a larger set? We will prove later on that they can lead to a larger set.
We will then prove in a later post that condition 2.5 or 2.6 leads to a larger set of computable functions than a UTM can compute, and thus that the Church-Turing Thesis is false. However, by dropping both conditions and adding a new condition by hand that restricts what we can do with effective functions, it can be made true.
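As a preview of why such an extension is genuinely stronger rather than a relabeling, here is the classic diagonalization argument in hypothetical code (the names are mine, and nothing here is an actual decider): no UTM-computable `halts` can exist, so any model that supplies one as a primitive computes strictly more.

```python
# Sketch of the classic diagonalization argument. Suppose, for contradiction,
# that an ordinary program could compute `halts`. The names are illustrative.

def halts(program) -> bool:
    """Assumed (for contradiction) to return True iff program() would
    eventually return, for every zero-argument program."""
    raise NotImplementedError("assumed computable only for the sake of contradiction")

def diagonal() -> None:
    # Ask the supposed decider about ourselves, then do the opposite.
    if halts(diagonal):
        while True:   # `halts` said we halt, so loop forever instead
            pass
    else:
        return        # `halts` said we loop, so halt immediately

# Whatever halts(diagonal) answers, diagonal() does the opposite, so no
# correct UTM-computable `halts` exists. A machine model that provides
# `halts` as a built-in primitive therefore computes functions no UTM can.
```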
Something that might help in understanding decidability
A snippet from Rafael Harth is generally useful here and doesn’t require much in the way of time investment:
When mathematicians talk about a ‘problem’ in this context, they always mean “an infinite set of problems such that (a) the answer to each problem is either ‘yes’ or ‘no’, and (b) the set contains arbitrarily large instances of problems”. Here are two examples:
Given an arbitrarily large natural number, decide if it is a prime number.
Given an arbitrarily long program code, decide if the program, if run, eventually halts.
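To see the contract in action, here is a minimal sketch for the first example; primality admits a single fixed procedure that always halts with a yes/no answer, while the halting example, by the diagonalization sketch above, does not (the code and names are my own illustration):

```python
# A decider for the first example: primality. One fixed procedure that halts
# on every input and answers yes/no correctly for arbitrarily large instances,
# which is exactly the contract in the definition above. Trial division is
# slow, but decidability only asks that it always terminates, not that it be fast.

def is_prime(n: int) -> bool:
    """Return True iff n is prime; always halts, for every natural number n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# No analogous procedure exists for the second example: a correct
# `halts_eventually` would have to answer for every program, which the
# diagonalization argument rules out for ordinary Turing Machines.
```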