But we are not interested in arbitrary programs. We are interested in well-written programs accompanied by a proof of their correctness. Checking such a proof is not only a feasible task, it is an almost trivial task.
Interesting! I have read that many mathematical proofs today require computer analysis. If it is such a trivial task to check the correctness of code that gives rise to the level of intelligence above ours, then my questions have not been relevant regarding the argument of the original post. I mistakenly expected that no intelligence is able to easily verify conclusively the workings of another intelligence that is a level above its own without painstakingly acquiring resources, devising the necessary tools and building the required infrastructure. I expected that what applied for humans creating superhuman AGI would also hold for AGI creating its successor. Humans first had to invent science, bumble through the industrial revolution and develop computers to be able to prove modern mathematical problems. So I thought a superhuman AGI would have to invent meta-science, advanced nanotechnology and figure out how to create an intelligence that could solve problems it couldn’t solve itself.
If it is such a trivial task to check the correctness of code that gives rise to the level of intelligence above ours …
Careful! I said checking the validity of a proof of code correctness, not checking the correctness of code. The two are very different in theory, but not (I conjecture) very different in practice, because code is almost never correct unless the author of the code has at least a sketch proof that it does what he intended.
I see, it wasn’t my intention to misquote you, I was simply not aware of a significant distinction.
By the way, has the checking of validity of a proof of code correctness to be validated or proven as well or is there some agreed limit to the depth of verification sufficient to be sure that if you run the code nothing unexpected will happen?
The usual approach, taken in tools like Coq and HOL, is to reduce the proof checker to a small component of the overall proof assistant, with the hope that these small pieces can be reasonably understood by humans, and proved correct without machine assistance. This keeps the human correctness-checking burden low, while also keeping the piece that must be verified in a domain that’s reasonable for humans to work with, as opposed to, say, writing meta-meta-meta proofs.
The consequence of this approach is that, since the underlying logic is very simple, the resulting proof language is very low-level. Human users give higher-level specifications, and the bulk of the proof assistant compiles those into low-level proofs.
There are a couple of branches worth mentioning in the “meta-proof”. One is the proof of correctness of the proof-checking software. I believe that fiddlemath is addressing this problem. The “specification” of the proof-checking software is essentially a presentation of an “inference system”. The second is the proof of the soundness of the inference system with respect to the programming language and specification language. In principle, this is a task which is done once—when the languages are defined. The difficulty of this task is (AFAIK) roughly like that of a proof of cut-elimination—that is O(n^2) where n is the number of productions in the language grammars.
Interesting! I have read that many mathematical proofs today require computer analysis. If it is such a trivial task to check the correctness of code that gives rise to the level of intelligence above ours, then my questions have not been relevant regarding the argument of the original post. I mistakenly expected that no intelligence is able to easily verify conclusively the workings of another intelligence that is a level above its own without painstakingly acquiring resources, devising the necessary tools and building the required infrastructure. I expected that what applied for humans creating superhuman AGI would also hold for AGI creating its successor. Humans first had to invent science, bumble through the industrial revolution and develop computers to be able to prove modern mathematical problems. So I thought a superhuman AGI would have to invent meta-science, advanced nanotechnology and figure out how to create an intelligence that could solve problems it couldn’t solve itself.
Careful! I said checking the validity of a proof of code correctness, not checking the correctness of code. The two are very different in theory, but not (I conjecture) very different in practice, because code is almost never correct unless the author of the code has at least a sketch proof that it does what he intended.
I see, it wasn’t my intention to misquote you, I was simply not aware of a significant distinction.
By the way, has the checking of validity of a proof of code correctness to be validated or proven as well or is there some agreed limit to the depth of verification sufficient to be sure that if you run the code nothing unexpected will happen?
The usual approach, taken in tools like Coq and HOL, is to reduce the proof checker to a small component of the overall proof assistant, with the hope that these small pieces can be reasonably understood by humans, and proved correct without machine assistance. This keeps the human correctness-checking burden low, while also keeping the piece that must be verified in a domain that’s reasonable for humans to work with, as opposed to, say, writing meta-meta-meta proofs.
The consequence of this approach is that, since the underlying logic is very simple, the resulting proof language is very low-level. Human users give higher-level specifications, and the bulk of the proof assistant compiles those into low-level proofs.
There are a couple of branches worth mentioning in the “meta-proof”. One is the proof of correctness of the proof-checking software. I believe that fiddlemath is addressing this problem. The “specification” of the proof-checking software is essentially a presentation of an “inference system”. The second is the proof of the soundness of the inference system with respect to the programming language and specification language. In principle, this is a task which is done once—when the languages are defined. The difficulty of this task is (AFAIK) roughly like that of a proof of cut-elimination—that is O(n^2) where n is the number of productions in the language grammars.