Hilbert’s Triumph, Church and Turing’s failure, and what it means (Post #2)
We’ve specified what the definition of the Church-Turing Thesis means, though we’ve really specified 2 Church-Turing Theses, and the purist one is addressed in the 2.5 case, while a modified version handles the 2.6 case. We also specified what we need to do in order to actually give a truth value to either thesis. Now we will show that either 2.5 or 2.6 as a condition will lead to a contradiction, in that they define a larger set of computable functions than the original definition, leading to it’s falsification. I also address the case where a real human tries to compute all the computable functions, and the short answer is no, because otherwise we could violate the Time and Space Hierarchy Theorems, so the set of computable functions by real world humans is not equal, and strictly smaller than all computable functions as traditionally defined. This will be a long post, so get a drink and a snack, so that you can read it all.
Edit: I’m not focusing on what is possible in reality in this post, for many reasons, so you will have to wait for that in the next post.
The 2.5 case
The 2.5 case is easy enough to see, and is the closest to a classical Church-Turing Thesis as we can extend the human with unlimited time or memory, or alternatively a UTM with a CTC, similar to how other attempted extensions of Turing Machines have been made in say, non-determinism, or perhaps quantum Turing Machines or other variants of Turing Machines. Computability theorists proved that a lot of extensions and variants of Turing Machines were equivalent in power, given unlimited time and memory.
The result below is essentially that once we allow arbitrary extensions that still support the conditions defined in previous posts, there is no longer an automatic equivalence in the functions Turing Machines can compute and the functions that arbitrary extensions of them can compute.
The link below gives the gory details on how to do it, but basically they managed to figure out a way to compute the halt function, as well as more than that, by getting the algorithm to show one of 2 unique fixed points: Either the arbitrary Turing Machine halts, and thus the unique fixed point is of the induced operation on y is a singleton distribution concentrated on y = σt, and outputs halt. If the arbitrary Turing Machine loops, then instead it outputs loop for every Turing Machine that loops, and the unique fixed point is a geometric distribution, where in this example it continually halves the probability.
The third step removes any other fixed points by essentially requiring any invalid execution history to go back to the start state, via step 3 of the algorithm.
https://www.scottaaronson.com/papers/ctchalt.pdf
We now need to verify whether the constraints were satisified:
2.1 is easy enough to verify, since the expected string length is always finite, and while there’s no computable upper bound, it is always finite, if arbitrarily long. Similarly, it might have to accept strings of unbounded length, but they’re not infinitely long. That means that it’s decidable in a finite, but unbounded time and memory.
2.2 is another easy property to verify, since they haven’t changed the definition of a UTM, and thus all the power is coming from the extension, as desired.
2.3 is yet again another property that we can verify to be correct, since it either answers Halt in the case where the Turing Machine halts, or it answers Loop in the case it doesn’t halt, and both answers have unique fixed points. Also, it accepts every string in the language, which is equivalent to answering yes to the original question, and it doesn’t accept any string not in the language, and thus the answer to the original question is no, so Rafael Harth’s condition is satisfied.
2.4 is the last property to verify, and the algorithm does consist of a finite number of exact instructions. Here’s the instructions below that are copied from the paper:
Lemma 2 Halt∈ ComputableCTC. Proof. We’re given a Turing machine hPi, and need to decide whether P ( ) halts. >Suppose P ( ) runs for at least t steps; then let σt be a string that encodes the first t steps of P ( )’s execution history in some canonical way. Thus, σ0 encodes a blank input tape that hasn’t yet been acted upon, with P in its initial configuration; and σt+1 can easily be obtained from σt by appending the result of one additional P step. Note that, given an arbitrary string y, together with knowledge of hPi, it’s easy to decide whether y= σt for some t, and if so which t. Call σt a halting history if it shows P halting at its tth step, and a non-halting history otherwise.
Now our TMCTC, M, takes hPi as input in its causality-respecting register RCR, and some string y as input in its CTC register RCTC, and does the following: If y is a halting history, then leave y unchanged in the CTC register, and output “HALT”
If y = σt is a non-halting history, then set y := σt+1 with probability 1⁄2 and y := σ0 with probability 1⁄2, and output “LOOP”
If y is not a valid execution history for P ( ), then set y := σ0, and output “LOOP”
Thus, we can immediately see that the algorithm, in conjunction with the extension, decides a function that is essentially the halting problem for Turing Machines. But UTMs can’t decide their own halting problem, and thus we have expanded the set of computable functions beyond it’s original definition, meaning that the Church-Turing Thesis that a UTM can compute everything an effective function could do, as defined in my previous post, can’t be correct.
The 2.6 case
The 2.6 case is quite different, but also has some similarities. It is not, properly speaking the Church-Turing Thesis, but a modified version, and thus for Church-Turing purists, the 2.5 case will be the relevant case for you. In this case, we want to show that a UTM can simulate a human with unlimited time and memory, that can compute more functions than the UTM itself is capable of computing, thus showing that the Church-Turing Thesis as defined in this sequence is false, in that a human with unlimited time and memory that computes the halt function, as well as more general oracle input function, is simulatable by a UTM, thus meaning that the set of functions that a UTM can simulate is not equal to the functions that a UTM can compute.
The way to do this is by using an infinite branching tree model, where there’s an uncountable infinity of branches, but only a countable infinity of vertices and edges, and each branch overlaps with an uncountable infinity of other branches, meaning that a UTM can simulate the seemingly unsimulatable.
The edges correspond to the simulated moments of time, and they are countable, meaning that our parallel simulation can get to all of them. The branches correspond to the entire infinite sequence of say, a light flash.
Essentially, we will use an oracle tape, but unlike the usual oracle tapes, which can say solve the halting problem, we will only be considering an oracle tape that only has 0s. This is trivially computable by a UTM. As the oracle tape consists of only 0s, this corresponds to a light that is always off. Now consider a Turing machine that simulates two such oracle machines in parallel, but before it simulates the nth timestep for each oracle machine, it alters the representation of the configuration of the second machine to put a 1 on the nth square on its tape. Since the simulated machine couldn’t have looked at a square that far away yet, it will just behave as if it were given an infinite bitstring of 1s. In this case it will simulate a light that is always on. So the main Turing machine will be simulating a light that is always on and a light that is always off in parallel.
Now let’s add the infinite branching, and copy the C-like pseudocode from the website
initialise_om_with_room_simulation_program(om[0]) copies = 1;
for (i = 0; ; i++) { for (j = 0; j < copies; j++) { fork(om[j], om[copies+j]); set_o_tape_square(om[copies+j], i, 1); }
> copies *= 2;
> for (j = 0; j <= copies; j++)
compute_next_step(om[j]);
}
initialise_om_with_room_simulation_program(om) sets up the specified oracle machine with the transition table that will make it perform a simulation of the room with the flashing lightbulb (whose state in minute n will be determined by the nth square of its initially blank oracle tape).
set_o_tape_square(om, i, value) just alters the ith square of the oracle tape of the oracle machine to be equal to the given value.
fork(om1, om2) simply sets the entire configuration of the oracle machine om2 (its tapes, tape heads, current state, and transition table) to be the same as that of om1.
Voila! We have a finite algorithm/effective method that can be run, and that a UTM can simulate an oracle later on that can solve the halting problem.
In essence, we now have our desired simulated oracle, and that this can run a human oracle that can solve the halting with unlimited memory and time, because we can initialize the function the starting oracle machine has to have any transition table we like, and any input we like, so we can code the halting problem in the original blank oracle tape UTM, and thus we can simulate someone calculating all of the busy beaver numbers, alongside the usual computable functions.
While it will go wrong in all but one branch, this is enough to refute the Church-Turing Thesis as defined, because there must exist a branch of the simulation where it calculated a larger set of functions than the UTM can compute, thus contradicting the Church-Turing Thesis in that universe, and since it’s essentially a universally quantified variable, or perhaps a for-all condition, and thus one instance of a condition being violated is enough to break the thesis.
This works because reductionism fails here, that is while most individual bitstrings have infinite Kolmogorov complexity, the set of all such sequences of bits has very little Kolmogorov complexity. In essence, sometimes you can’t reduce something to it’s component parts without incurring infinite complexity penalties, and holism is required.
Link is below:
http://www.amirrorclear.net/academic/ideas/simulation/index.html
Credit to Scott Aaronson, Toby Ord, Mohammad Bavarian, Giulio Gueltrini, David Hilbert, Alonzo Church, Alan Turing, Rafael Harth, myself and many other researchers for both formalizing what a computer model could do, what effective functions were, and in general formalizing enough about our mathematical notion of computation and math such that I could easily see how the thesis didn’t work, and importantly gave us the ability to talk about real life computers reliably.
Even with Church and Turing’s failure, comparable to how Euclid failed in getting only one possible model of geometry, since hyperbolic geometry exists, it still can spawn many interesting questions.
Hilbert’s triumph
There were certain problems that Hilbert thought were very important to solve, like say solving a Diophantine equation or decide whether a statement in first order logic is universally valid, and he wanted to find an algorithm for them to solve in the general case using effectively calculable functions.
Today, we will be showing that the effectively calculable functions, as defined in the sequence, and allowing arbitrary extensions to a UTM in condition 2.5 that respect the constraints, leads to the decidability of a few problems that would be undecidable on a basic Universal Turing Machine, and since it only asks for satisfiability, not validity (That is it must be true in 1 model that satisfies the constraints, but we don’t require it to be true in every model. In particular, it includes satisfiability and validity problems for first order logic, and the Diophantine Equations become solvable. I don’t know whether condition 2.6 would imply all the statements below, so I have used condition 2.5 below.
Hilbert’s 10th problem is decidable by an effective function
This is basically using the MRDP theorem, or Matiyasevich’s theorem, in that all computably enumerable sets are Diophantine Equations, and the converse is true, that is all Diophantine Equations are computably enumerable, and thus in the class RE.
Since the UTM with a CTC extension contains both RE and co-RE complexity classes as decidable problems, by a generalization of the algorithm/effective function used in Scott, Mohammad and Giulio’s paper, we can immediately note that there is an algorithm for the 10th problem, which asks for the existence of an algorithm to solve the Diophantine equation, has a yes answer.
Hilbert’s Entscheidungsproblem, or alternatively the validity problem for first order logic, is decidable by an effective function
This one is trivial, as the CTC Turing Machine model can solve all problems that are Turing-reducible to the Halting problem, combined with the fact that the validity problem for first order logic is RE-complete, means that there is an algorithm for Hilbert’s Entscheidungsproblem or the validity problem for first-order logic.
Some bonus problems related to Hilbert, but that Hilbert didn’t actually declare as problems will be shown to be decidable below by me:
The unrestricted satisfiability problem for first order logic is decidable by an effective function
This is because the satisfiability problem for first order logic is decidable by an extension to the UTM, which gives us an effective function that satisfies the constraints I made back in post 1. In particular, it’s able to compute all of the co-RE complete functions, since there is a Turing reduction from the Recursively Enumerable class to the Co-Recursively Enumerable class, and the algorithm on a CTC UTM can solve all problems that are Turing reducible to the halting problem, and since the now solvable Halting problem is RE-complete, it follows that the satisfiability problem must be solvable by an effective function.
Unrestricted satisfiability in first order logic is co-RE-complete.
Finite satisfiability of first order logic is decidable by an effective function
This result essentially comes about because finite satisfiability is in the Recursively Enumerable complexity class, and we have a CTC algorithm for a UTM, because it can solve RE-complete problems like the halting problem.
Finite validity of first order logic is decidable by an effective function
Trakhtenbrot’s theorem is essentially a theorem that tells us that finite validity is not recursively enumerable, but it is co-recursively enumerable. That’s because if we could do so, we could put the problem of whether a machine does not halt in RE, which can’t be the case, and instead is in co-RE. However, since a CTC extension of a UTM can solve all problems Turing-reducible to the halting problem, which includes co-RE complete problems, finite validity of first-order logic is decidable.
Implications
The most important implications of the post today are the following, as well as some open questions for people to answer in the comments:
Computable!=Simulatable
This will have an impact on a future post, but for now, we can say that there’s a gap between what you can compute to solve a question, and what you can simulate, that is what you can do if you’re trying to just run the computation, with no expectation of an answer.
Are there other cases where simulatability doesn’t equal the ability to compute something? Are there general properties that are required in order to get this non-equivalence?
Decidability is relative to a machine/computational model, and it’s constraints, and it’s not universal
This is probably another point that people may already know, but it’s useful to know this, since this can be important, and was important in why Hilbert shouldn’t have given up entirely once he heard that the Entscheidungsproblem was undecidable for Universal Turing Machines.
This also means any claimed decidability/undecidability proof needs to make it’s assumptions clear, because otherwise you can confuse yourself.
It is not an absolute or unique concept, that is the intuitive definition splits apart into multiple non-equivalent definitions, unlike with the quotes below:
Gödel also said:
The resulting definition of the concept of mechanical by the sharp concept of “performable by a Turing machine” is both correct and unique. … Moreover it is absolutely impossible that anybody who understands the question and knows Turing’s definition should decide for a different concept. (Quoted in Wang 1996: 203)
Or this quote below:
Gödel said that “the great importance … [of] Turing’s computability” is
largely due to the fact that with this concept one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen. (Gödel 1946: 150)
More problems are decidable, but there’s an ever increasing tower of intractability
This is important, both for methods that use the basic effective functions/algorithms and extensions of Turing Machines, and for people that want to focus on very different models.
Unfortunately, the tower of intractability of problems is vast, and we don’t know if it actually stops, that is things can get ever more complex and difficult the more you’ve scaled the tower.
Hilbert was right in that we could decide with an algorithm all of 1st order logic’s validity, as well as Diophantine equations, and it turns out satisfiability is also decidable, and when we restrict to finite cases of satisfiability or validity in first order logic, they are also decidable. However, there are more problems that aren’t solved, like the totality problem under condition 2.5.
Define what your model of computation is able to do, and be careful about extensions
This is very important, and IMO this is the most important takeaway: Define what your model of computation is capable of doing, what constraints if any, and what extensions are allowed, since you can’t assume that an extension will always lead to defining the same class of computable functions in your model.
In particular, unboundedness plus a lack of many constraints can easily lead to too much freedom for what your model can do.
Be very careful of implicit or explicit universal claims
This is very important, as we quite often underestimate the difficulty of making universal claims about X.
This can be strengthened into one subpoint:
When you make a universal claim, it is not enough to prove that certain examples exist, or even that certain seemly different things are equal. You also need to think about how your conjecture or thesis can be broken, and in particular you’d need to either have good reason to believe that the counterexamples do not work, modify your conjecture, or accept that it’s false.
A perfect example here is Amalthea’s comment below:
https://www.lesswrong.com/posts/fmj35iqfJgrgsi59F/?commentId=tAXkemXFcuvrQL6it
The problem, Amalthea, is that the edge cases are exactly the problem for universal statements, akin to how the Weierstrass function disproved the assumption that all continuous functions were differentiable, and the reason I modeled Church and Turing’s thesis as a universal claim is because otherwise, some statements from the SEP describing Church and Turing’s thesis become very suspect or even false if we assume they aren’t intended to be universal.
Here’s one good example:
The Church-Turing thesis is the assertion that this set S contains every function whose values can be obtained by a method satisfying the above conditions for effectiveness.
Another good example is a quote below:
Gödel also said:
The resulting definition of the concept of mechanical by the sharp concept of “performable by a Turing machine” is both correct and unique. … Moreover it is absolutely impossible that anybody who understands the question and knows Turing’s definition should decide for a different concept. (Quoted in Wang 1996: 203)
The best example is this quote, which expresses absolute confidence, which was very unwarranted in light of the results in the post:
Gödel said that “the great importance … [of] Turing’s computability” is
largely due to the fact that with this concept one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen. (Gödel 1946: 150)
This of course, does not work in light of the new results.
Critically, the every condition changes things, as if I can prove that I can find any way to extend the set of functions by extending the UTM that is consistent with the constraints of an effective function, or I can figure out a way to simulate a universe via a UTM where humans can compute more functions than a UTM can, I have broken the thesis, as I did here.
People focusing on different models of computation are doing real computability theory work, just as today we’d call hyperbolic geometry work real geometry
In a sense, this discovery by many people above reminiscent of how people thought that Euclidean geometry was the only valid model of geometry, and the discovery of hyperbolic geometry shattered a similar implicit thesis that the only valid geometries are Euclidean geometry, and both involved very smart people assuming that their intuition was good enough to substitute for a formal treatment of the subject. Somewhat similar remarks could be made of the Weierstrass function breaking the implicit assumption that all functions that are continuous must also be differentiable.
What about if we assumed the constraints of reality?
People have asked me to keep to the constraints of reality, and while I don’t think this was Turing’s goal, given the UTM has no bounds on the time or memory required to solve a problem, which is unlikely to be the case in real life, as well as this quote below, which explicitly states that it idealizes things away, I will address the question today. It’s false there too, because if humans could compute all the computable functions in constant time and space, as humans only have a constant memory capacity and constant time, that would violate the Time and Space Hierarchy theorems, because we could solve problems that are proven to require exponential time and space in constant time and space.
Either way, the set of effective functions is larger or smaller than the original definition, and thus cannot be universally correct.
Quote is below:
Turing introduced his machines with the intention of providing an idealized description of a certain human activity, the tedious one of numerical computation.
Open questions:
Open question: How far can we go with effective functions as defined in the sequence?
We know that effective functions as defined can at least get to the classic Halting oracle tier today. The real question is, how far can we go?
The conditions are stated for completeness below:
Effectively calculable functions have the following constraints/rules:
2.1 They must always give an answer in finite time, but the time bound can be arbitrarily large, and the space required to solve a problem may be unbounded, but never infinite, just arbitrarily large.
2.2 The human is assumed to have unlimited paper to act as a memory, and unlimited time to compute (They are immortal.)
2.3 The original condition has been replaced by the following condition below from Rafael Harth:
In this setting, we call a problem decidable iff there exists a single, fixed procedure that can take arbitrary instances of this problem and always behaves like this:
If the correct answer is ‘yes’, it outputs ‘yes’. If the correct answer is ‘no’, it outputs ‘no’.
The original statement is included below for completeness:
Its instructions need only to be followed rigorously to succeed. In other words, it requires no ingenuity to succeed.[4]
2.4 This condition is quoted below:
It consists of a finite number of exact, finite instructions.
2.5 Arbitrary extensions to the UTM are valid as long as they obey the constraints above.
2.6 Alternatively, any algorithm or effective method as defined in this post is allowed to be used so long as it is simulatable by a UTM, even multiple UTMs are allowed to simulate something.
This splits into 3 subquestions below:
Open question 1: Using the constraints on effective functions in the conditions below plus condition 2.5, how many functions can we solve?
Essentially, this is asking about how far we can go with extensions.
Open question 2: Using the constraints on effective functions in the conditions below plus condition 2.6, how many functions can we simulate?
Essentially, this asks how many effective functions we can simulate with a Universal Turing Machines.
Open question 3: Do they to different functions that are effective, or do they correspond to the same functions with the same power?
This is asking whether arbitrary extensions to a UTM are just as good as simulating all the functions on a UTM. Only one extension that is equal is necessary to give a yes answer, while all possible extensions that respect the constraints cannot equal a UTM in power is required for it to be a no answer.
I’d like for others to follow in my footsteps, to at least advance what we can do, in a theoretical sense.
Emil Post was rather right here when Church possibly smuggled in a definition as a hypothesis, and it was indeed needed to be verified that it worked, and the quote is below:
Emil Post referred to Church’s identification of effective calculability with recursiveness and lambda-definability as a ‘working hypothesis’, and he quite properly criticized Church for masking this hypothesis as a definition:
[T]o mask this identification under a definition … blinds us to the need of its continual verification. (Post 1936: 105)
It’s an okay definition, albeit there are better definitions now if you just want to focus on the UTM model and only the computable functions, but as a hypothesis of all computation, it is false.
My basic, fundamental problem is people keep switching between the correct motte of the Church-Turing thesis as a definition of computability, albeit there are better definitions, and then expand it into the incorrect bailey of implying absoluteness/uniqueness/only valid or intuitive definition of computability.
In the next post, I’ll try to state what we can actually compute, over the very long run, and address the practicalities more.
Aside: I finally found where the Church-Turing Thesis came from
There’s a link to the Stanford Encyclopedia of Philosophy, and I think they’re the best resource for the Church-Turing Thesis, and the link will be below here:
I am not an expert on this, but other people have pointed out mistakes you made in this and your previous articles. You seem to misunderstand or misinterpret standard definitions, and use it to conclude that things generally accepted in computer science are wrong. My crackpot alarm is beeping very loud.
I’m strongly uncomfortable with the “crackpot” conclusion you jump to immediately. Without being an expert and just skimming through his post(s), wouldn’t the more likely conclusion be that he’s not simply arguing generally accepted things in computer science are plain wrong, but rather would be weakened under a different set of assumptions or new generalizations? Given that this particular area of computer science is often about negative results—which are actually kind of rare if you zoom out to all areas of mathematics—there are potentially going to be more weakening(s) of such negative results.
Sort of. The basic thing I’m trying to point at is that without more assumptions, the standard tools allow you too much freedom to compute things, and thus you need to restrict the model more to ensure that only the traditional computable functions are actually computable.
I think you’re getting too hung up on your personal definition of “effectiveness” which doesn’t match what other people mean by it. If you actually wanted to show that Church and Turing’s definition matched your own, I think you should quote their original papers. That said I think it’s very unlikely you’ll find that to be the case, since Turing was well aware that halting oracles
existedcould be defined and were uncomputable, so clearly he didn’t intend to include them in his definition of effective operations.Regarding open question 1, I think the answer to “how many functions can we compute in this model” is “all functions”. Since you are allowing arbitrary oracles to be used(by the branching argument), for any given function we can just use an oracle which computes that function.
Halting oracles do not exist, except Platonically as mathematical objects. We cannot build one. We cannot compute one. What did Turing say that you are describing thus?
That you can define halting oracles and prove they are uncomputable(thus he was well aware that you could mathematically define notions similar to computability which exceed the standard one)
To be fair, while I sort of grant that Turing thought that what we’d call halting oracles may be possible, I still have 2 issues:
The thesis should have been just a definition, and one among some equal definitions, rather than an all-encompassing claim (even then you’d still need to put in some more assumptions for it to work.) This is important, since while I’m fine with it being used as a definition (albeit there are more formal concepts for that), I get the sense that people want to extend it into something larger and more philosophical, and more universal than it is.
Also, the fact that Turing Machines aren’t equivalent under arbitrary extensions is a little important, as you’d need to restrict the way the Turing Machine is extended, if we want to limit our intended models to only allow the computable functions/UTMs as traditionally thought.
He still thought that a halting oracle couldn’t be a machine, and it turns out that intuition was still wrong (I’d have to admit that something weird is happening, because I interpreted that quote very differently from interstice did, since he reacted with a disagreement.)
If by “thought possible” you mean he was aware of them as a mathematically well-defined concept, there’s no doubt! He was the first to define them!
ETA: I see you reacted “I checked, it’s false” here. That’s wrong—it’s unambiguously the case that Turing was the first to define oracles machines. See here.
Well, I do think it’s pretty interesting that of all the machines we can apparently build in reality, there exists a universal machine that can emulate them all. That’s a very important fact, and highly non-obvious a priori!
I disagreed because in context I think it’s clear he meant “a machine we can build in reality”. Not “a machine which magically evaluates the problem” or “a machine which flips coins which we assume to always land in such a way that it solves the problem”.
I looked at the link, and it only states the following:
“Let us suppose we are supplied with some unspecified means of solving number-theoretic problems; a kind of oracle as it were. . . . this oracle . . . cannot be a machine. With the help of the oracle we could form a new kind of machine (call them o-machines), having as one of its fundamental processes that of solving a given number-theoretic problem.”
So as stated, it is still false, as it doesn’t even vaguely suggest a mathematically defined way to show that a halting oracle must exist. It doesn’t nearly work as a definition, and this would require a lot more development before the claim that Turing defined an o-machine mathematically was true. He at best suggests them as maybe possible, but he didn’t do any of the work required to mathematically define a halting oracle.
This is unambiguously false. If you want to read a more elaborated version, see Turing’s original paper. To a mathematician, what Turing says there is a definition—the question of how such an oracle could be realized is a totally separate one. I assure you, if you ask any computability theorist, logician, etc., they will agree that this counts as a definition.
As a general note, I don’t think you should be writing articles attempting to “disprove” Turing and Church without understanding the basics of computability theory and how theorists think about definitions and proofs. That way lies madness. I recommend reading an undergraduate textbook on computability theory in detail, doing the exercises as well, and afterwards coming back to this topic. “Computability and Logic” by Boolos is apparently pretty good.
I decided to check it, and for now, I’ll accept the definition, even with my issues.
Okay, a crux that I have can be stated as this:
I do not believe Turing actually wanted to do this, in the sense that he wanted to focus on real machines, and I believe that his comments instead suggest that he wanted to focus on ideal machines, rather than machines that could be implemented in reality.
In particular, I’ve generated some quotes that suggest that the idealization was what Turing was focusing on.
Also, I tend to read things conservatively, such that I don’t assume much context beyond what is said, and I don’t think Turing was ever concerned with what we could compute in reality, instead of idealizations.
This is definitely important, but this is more so due to the unbounded space and time allowed, as well as some parallelism.
To make it even better, in this post I showed that you could emulate any arbitrary oracle tape on any input with Turing Machines and thus this lets us emulate even halting oracles, or even let us simulate a universe where humans can calculate exact real numbers.
So this will hold in way more situations than you think, that even if it turns out we can compute the halting problem in real life, there will still be a Universal Turing Machine that simulates it.
It’s impressive and important as a discovery though.
Edit: I looked at the link, and it only states the following:
So as stated, it is still false, as it doesn’t even vaguely suggest a mathematically defined way to show that a halting oracle must exist. It doesn’t nearly work as a definition, and this would require a lot more development before the claim that Turing defined an o-machine mathematically was true. He at best suggests them as maybe possible, but he didn’t do any of the work required to mathematically define a halting oracle.
In general, I think this is IMO my general model of Turing: He had some rather interesting intuitions, but damn did he need a lot of work to make them more formal, whether in the case of the Church-Turing thesis as a definition, or on o-machines, in general things rested way the fuck too much on intuitions, and to a certain extent computability theory has suffered from the overfocus of intuition on the founders, comparable to the issues of geometry because people were too enamored on Euclid’s intuitions.
Original comment: This is overstating what he did. He did think about o-machines, but he never actually made a math model of the o-machines, nor did he ever define it beyond a very basic intuition. Scott Aaronson, Mohammad Bavarian, and Giulio Gueltrini actually did the work to create an o-machine constructively, and even if we allow non-constructive definitions, Turing and Church was definitely very far from actually showing that a halting oracle exists. That one had to be done by other researchers.
While he was aware of the possibility of o-machines, the problem here that arises is that given Turing’s focus on idealizations of how humans calculated, that other people needed to be much more careful in their claims, because you cannot make a universal claim that holds without also addressing how the purported counterexamples don’t hold up.
And while Turing gave some intuition for why the Church-Turing Thesis should hold, intuitions are not proof, for the same reasons that the intuitive notion of continuous functions always being differentiable was disproved.
Yes, Turing machines are an idealization, but they’re meant to be an idealization of processes of computation we can carry out in the real world. That’s why computability is a useful notion to study, it corresponds to something we care about in reality. In contrast, a notion of computation which includes time travel or arbitrary oracles is less useful, because as far as we know those don’t exist. Regarding the point about time boundedness, yes Turing-computability does not exactly correspond to things we will be able to compute in reality(probably), but it’s a lot closer to reality than models of computation with oracles or time travel. Similarly, the Earth is not exactly spherical, but models which treat the Earth as spherical are a lot closer to reality than models which treat space as having 7 dimensions.
I think this might be the core issue with your argument: the Church-Turing thesis is not the sort of thing you can prove or disprove mathematically. It’s an attempt at matching an informal concept with a set of axioms. You can’t prove or disprove a set of axioms, you can just choose sets which are more or less useful based on philosophical argument, empirical evidence, etc. Everybody who studies computability is aware that you can define alternative models of computation and study their properties, it’s just that those other models are considered less important because they can’t actually be built. If you want to convince people to change their notion of computability, you have to argue that your new definition is more useful for modeling something in reality they care about, trying to pull a ‘gotcha’ based on a list of requirements you made up yourself is pointless.
The problem is by idealizing that away, we have made it much less useful. Universal Turing Machines as Turing defined them are basically in a similar class to ideas like arbitrary oracles or time travel: they require assumptions that basic physical constraints are fundamentally wrong.
In other words, idealization buys us too much power.
In particular, I disagree with this heavily:
Yes, technically UTMs are a little better at modeling computation, but it’s still much much farther than likely reality (this is so here), because it relies on certain assumptions being wrong, and any way this happens would likely imply that the other models of computation suddenly become more plausible. It’s a little less wrong, but there are other models which are way more accurate. In other words, the idealization is so unrealistic
If you want to focus on real life computation without having to leave theory, computational complexity already does that job. If you want to focus on near future reality, then complexity theory is best while not having to deal with a lot of messiness.
I have two things to say here:
While Euclid’s axioms of geometry weren’t disproven, the intuitive notion that this represented the only valid/true geometry was broken, and I see a similar issue here.
Even then, there are other ways to get to the UTM using more formal definitions, and in the case where you just want to focus on UTMs, that’s fine, but don’t try to imply that it’s somehow universal/the only true model etc.
Also, how you do this: “It’s an attempt at matching an informal concept with a set of axioms.” can matter a lot, and while I expect some intuitions to focus on the UTM, others won’t, and instead get drastically different results.
I think a big difference between us is that I don’t expect computability theory to model our reality at all, just what is possible in the multiverse of logic, whereas you do expect computability theory to model our reality, and thus I’m fine with many of my results even if they can’t be applied in reality, since I generally think that even the computable stuff isn’t useful at all in applied reality, and thus I usually focus on arbitrary idealizations when focusing on computability theory.
You disagree with that because you do expect computability theory to model our reality.
If I wanted to model computational reality, I’d use very different tools than computability theory to model it.
Edit:
It certainly wasn’t obvious to Turing, Church and Godel et al. I can definitely agree that modern people studying computability theory may understand this, at least implicitly, but I don’t think that happened with the original founders of computability theory, and the link you gave me is quite insufficient, in that ignoring the other falsities that Turing made in the quote, it didn’t even define the concept beyond a very basic level.
Other researchers had to do this.
For the purposes of this post, we are not much concerned about the constraints of our specific reality, for 3 reasons:
If we accept this argument, then it’s unlikely that a true Universal Turing Machine exists in real life either, because it doesn’t have time bounds or memory bounds. So we are already idealizing things away to make it work at all.
If we did use real humans computing/counting things, rather than idealized human computing/counting things, then the problem becomes the fact that we only have a finite, constant memory and time, and the functions/algorithms that are in the computable set as defined by a UTM does not collapse to the set of functions calculable by a human. If they did, such that human people in real life could calculate all the computable functions in constant time and memory, that would violate the time and space hierarchy theorems. I literally addressed this.
While we think that halting oracles can’t exist, and there are many good reasons to believe that, it doesn’t totally mean that we figured things out yet, and while I basically agree that it’s unlikely, I am unwilling to make very firm conclusions about the very far future, including this one.
Even then, I’m still a bit mystified, in that I don’t understand why people treat it like an absolute notion that applies to all computational models, rather than simply a definition, and one that can be replaced by other equivalent or different definitions like this quote below:
or like this quote below:
Putting it less politely, my biggest issue is a motte and bailey is going on, where people want to defend the correct motte of the Church-Turing thesis as a definition, albeit there are other equivalent ones, but people want to go beyond that and defend the incorrect bailey that it’s absolute/universal/the only valid definition of something, similar to how people thought the intuitive definition of Euclidean Geometry was the only possible valid geometry, at least implicitly.
I will check out Turing and Church’s papers, though I will say that for the conditions 2.5 and 2.6, they’re there to express two ideas below:
2.5 was essentially a statement explicitly saying that no other constraints to the problem were added, other than the constraints above.
2.6 was not Turing’s idea or problem, and instead it’s a modern problem, I’ll have to edit the post to say that this is not something that Turing thought about.
And condition 2.3 is essentially replacing the rigorousness of something instead of creativity by instead asking whether a single algorithm can solve arbitrary instances of a problem, like in a decision problem where the answers are yes or no, I require the algorithm to always say yes if it’s in the set, and no if it’s not in the set, and that this must be doable on arbitrary instances of a problem. in other words given that algorithm applied to the problem, you don’t need to think about it, you can just be rigorous, in the sense that you apply the same algorithm for arbitrary instances of a problem, and can follow it by rote, and you don’t need to have a different algorithm.
This seems like it’s instead an answer to Open Question 2, in that we can simulate all oracle tapes on all inputs, finite or infinite, which is I think equivalent to solving all the functions from N to N, and that’s the set of all programs. I was also asking for an upper bound, too if you can supply it, but I’ll take it as a partial answer.
Is there a reason that allowing arbitrary extensions to the UTM that respect the constraints of an effective function does the same thing? That is, if we give the UTM arbitrary extensions, can we too get it to solve say all the functions from N to N via arbitrary extensions of the UTM model, and is it upper bounded by all functions from N to N?
Edit: I’ve decided to share some preliminary results of my checking, and I already see some issues below:
For example, this statement:
Is correct, but misses major nuance, and importantly assumes it’s not a machine. That’s a problem, because without more assumptions, you can’t exclude halting oracles from your definition of effective operations. He assumed that there could be an unspecified oracle, but the problem is in the statement below, he falsely believes it’s not a machine that calculates effective functions, when it is, and critically he didn’t know that you could make the model more concrete into an actual machine. To be clear, I don’t blame Turing too much, as the results in the post weren’t known, and Church and Turing made the entire foundation from scratch, so it was inevitable that mistakes would be made.
This is from the paper Systems of Logic Based On Ordinals, link is below:
https://pure.mpg.de/rest/items/item_2403325_2/component/file_2403324/content
There may be other problems, but that’s just the first problem I found.