The AI can be adapted to other, less restricted domains
That ideas from a safe AI can be used to build an unsafe AI is an argument against working on (or even talking about) any kind of AI whatsoever, not just against this proposal.
The AI adds code that will evolve into another AI to its output
The output is to contain only proofs of theorems: specifically, a proof (or refutation) of the theorem given in the input. The state of the system is to be reset after each run so as not to accumulate information.
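To make the intended interaction concrete, here is a minimal sketch of that run protocol (the Answer type, the search_for_proof stub, and every name in it are placeholders invented for illustration, not part of any real system): one theorem statement in, at most one proof or refutation out, and nothing carried over from one run to the next.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Answer:
    theorem: str               # the statement that was posed
    proof: Optional[str]       # a formal proof of the theorem, if one was found
    refutation: Optional[str]  # a formal proof of its negation, if one was found


def search_for_proof(theorem: str) -> Optional[Tuple[str, bool]]:
    """Placeholder for the actual prover; returns (proof_text, is_refutation)."""
    return None  # stub: the real search is out of scope here


def run_oracle(theorem: str) -> Answer:
    # Each run starts from scratch: nothing is read from or written to
    # persistent storage, and everything built during the search dies
    # with this call frame, so no information accumulates across runs.
    result = search_for_proof(theorem)
    if result is None:
        return Answer(theorem, None, None)
    proof_text, is_refutation = result
    if is_refutation:
        return Answer(theorem, proof=None, refutation=proof_text)
    return Answer(theorem, proof=proof_text, refutation=None)
```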
The AI could self-modify incorrectly and result in an unfriendly AI
Any correct or incorrect self-modification is still restricted to the math domain, and so cannot result in an unsafe AI.
A bug in the environment itself
Guarding against software bugs is easy in this case: you design an abstract virtual machine environment for the AI, implement that environment in software, and then formally prove that the implementation is correct.
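As an illustration of what such a correctness statement could look like, here is a toy Lean 4 sketch (the tiny machine and its invariant are invented for the example, not a real design): the abstract machine is modelled as a pure step function whose only observable effect is appending a token to a proof buffer, and the theorem states one invariant a verified implementation would have to preserve.

```lean
-- Toy model: the "VM" state is just the proof buffer, and a step
-- can do nothing except emit one more output token.
abbrev VMState := List Nat

def step (s : VMState) (tok : Nat) : VMState :=
  tok :: s   -- the machine's only observable effect

-- Sample invariant: a step never erases anything already emitted.
theorem step_preserves_output (s : VMState) (tok : Nat) :
    ∀ x, x ∈ s → x ∈ step s tok := by
  intro x hx
  unfold step
  exact List.mem_cons_of_mem tok hx
```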
Guarding against errors caused by cosmic rays is also easy: you estimate the probability of such an error and add redundancy until the expected number of undetected errors over the lifetime of the universe is far below one.
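A back-of-the-envelope version of that redundancy argument, with every number an assumption chosen only for illustration: run k identical copies of the machine and accept an answer only when all copies agree bit-for-bit; if undetected hardware faults hit each copy independently with probability p per run, then all k copies being corrupted in the same way has probability at most p**k, so k can be raised until the expected number of such coincidences over the system's whole operating life is negligible.

```python
p = 1e-9           # assumed undetected-fault rate per copy per run
total_runs = 1e16  # assumed number of runs over the system's entire lifetime

# Increase the number of redundant copies until the expected number of
# coincident, identical corruptions over the lifetime is far below one.
k = 1
while p ** k * total_runs >= 1e-6:
    k += 1

print(f"{k} redundant copies give an expected "
      f"{p ** k * total_runs:.2e} coincident faults over the lifetime")
```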
Look up how difficult it is to sandbox untrusted code
Sandboxing untrusted code is easy. The difficult thing is sandboxing it while making it think that it runs normally. This is irrelevant here.
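For concreteness, here is a minimal sketch of the ordinary, "easy" kind of sandboxing meant here (Linux-specific, deliberately simplified, and certainly not a complete sandbox): the untrusted process gets an empty working directory, an empty environment, hard CPU and memory limits, and only stdout/stderr as a channel out. None of this tries to hide from the code that it is being confined, which is the hard problem that does not arise in this setting.

```python
import resource
import subprocess
import tempfile


def limit_resources():
    # Applied in the child just before exec: hard CPU, memory, and fd limits.
    resource.setrlimit(resource.RLIMIT_CPU, (60, 60))        # 60 s of CPU time
    resource.setrlimit(resource.RLIMIT_AS, (2**31, 2**31))   # ~2 GiB of memory
    resource.setrlimit(resource.RLIMIT_NOFILE, (8, 8))       # almost no open files


def run_untrusted(command: list[str]) -> subprocess.CompletedProcess:
    with tempfile.TemporaryDirectory() as empty_dir:
        return subprocess.run(
            command,
            cwd=empty_dir,           # nothing to read in the working directory
            env={},                  # no environment variables leak in
            capture_output=True,     # stdout/stderr are the only channel out
            preexec_fn=limit_resources,
            timeout=120,
        )
```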
I don’t believe that a system can work only in formal proofs
It is quite probable that a “pure math Oracle” system cannot work. The point was that it can be made safe to try.