Oh, OK. In that case, what you are trying to achieve is (theoretically) boxing a (potential) AGI, without a gatekeeper. Which is kind of overkill in this case, and wouldn’t be solved with a choice of language anyway :)
I think it is possible to prove that a given boxing works, if it’s sufficiently simple. Choosing the language isn’t enough, but choosing the interpreter should be.
Take Brainfuck for instance: replace the dot (“.”), which prints a character, with two other statements: one that prints “yes” and exits, and one that prints “no” and exits. If the interpreter has no bug, a program can only:
Print “yes” and kill itself.
Print “no” and kill itself.
Do nothing until we kill it, or otherwise fail.
Assuming the AI doesn’t control the external world by heating the host Intel processor in a smart way, we should be able to prove that we’re otherwise safe.
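To make this concrete, here is a minimal sketch of such a restricted interpreter in Python. The opcodes `y` and `n` are my own placeholder names for the two “print and exit” statements; the `,` input command is also dropped, and a step budget stands in for “do nothing until we kill it”:

```python
def run(program, max_steps=100_000):
    """Interpret a restricted Brainfuck variant: no '.', no ','.
    Hypothetical opcodes: 'y' prints "yes" and halts, 'n' prints "no" and halts.
    Returns "yes", "no", or None (step budget exhausted, i.e. we kill it)."""
    tape = [0] * 30_000
    ptr = 0
    pc = 0
    # Precompute matching brackets for loops.
    stack, jumps = [], {}
    for i, c in enumerate(program):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    for _ in range(max_steps):
        if pc >= len(program):
            return None  # fell off the end: "otherwise fail"
        c = program[pc]
        if c == '>':
            ptr = (ptr + 1) % len(tape)
        elif c == '<':
            ptr = (ptr - 1) % len(tape)
        elif c == '+':
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-':
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '[' and tape[ptr] == 0:
            pc = jumps[pc]
        elif c == ']' and tape[ptr] != 0:
            pc = jumps[pc]
        elif c == 'y':
            return "yes"  # the only other observable outcome
        elif c == 'n':
            return "no"
        pc += 1
    return None  # ran too long: we kill it
```

The point is that the entire output channel is those two `return` statements, so the safety argument reduces to auditing a few dozen lines of interpreter rather than the untrusted program.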