An automated proof checker?
Indeed, provided that the automated proof checker is absolutely, entirely, utterly bug-free and exploit-free! When was the last time someone released a fairly complicated, mission-critical piece of software with an assertion that it was totally secure and bugless… and wasn't proven wrong afterwards?
Automated proof checking is actually very simple, and we already have bug-free and exploit-free proof checkers.
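Just to illustrate how small the trusted core of such a checker can be, here is a toy sketch in Python (my own illustration, not any of the real systems linked elsewhere in this thread) that checks Hilbert-style propositional proofs: every step must either instantiate an axiom schema or follow from two earlier steps by modus ponens. Real checkers handle richer logics, but the kernel has the same shape.

```python
# Toy proof checker for a Hilbert-style propositional calculus.
# Formulas are nested tuples: ("var", "p") or ("->", A, B).
# A proof is a list of steps, each either
#   ("axiom", formula)  -- formula must instantiate an axiom schema
#   ("mp", i, j)        -- modus ponens from earlier steps i and j

def imp(a, b):
    return ("->", a, b)

def is_impl(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "->"

def matches(schema, formula, bindings):
    """Match a formula against an axiom schema; strings are metavariables."""
    if isinstance(schema, str):
        if schema in bindings:
            return bindings[schema] == formula
        bindings[schema] = formula
        return True
    if not (is_impl(schema) and is_impl(formula)):
        return False
    return (matches(schema[1], formula[1], bindings)
            and matches(schema[2], formula[2], bindings))

# Two standard axiom schemas:
#   K: A -> (B -> A)
#   S: (A -> (B -> C)) -> ((A -> B) -> (A -> C))
AXIOM_SCHEMAS = [
    imp("A", imp("B", "A")),
    imp(imp("A", imp("B", "C")), imp(imp("A", "B"), imp("A", "C"))),
]

def check_proof(steps):
    """Return the list of proved formulas, or raise ValueError on a bad step."""
    proved = []
    for n, step in enumerate(steps):
        if step[0] == "axiom":
            formula = step[1]
            if not any(matches(s, formula, {}) for s in AXIOM_SCHEMAS):
                raise ValueError(f"step {n}: not an instance of any axiom schema")
            proved.append(formula)
        elif step[0] == "mp":
            i, j = step[1], step[2]
            if not (i < n and j < n):
                raise ValueError(f"step {n}: references a later step")
            implication, antecedent = proved[i], proved[j]
            if not (is_impl(implication) and implication[1] == antecedent):
                raise ValueError(f"step {n}: modus ponens does not apply")
            proved.append(implication[2])
        else:
            raise ValueError(f"step {n}: unknown rule {step[0]!r}")
    return proved

# Demo: the textbook five-step proof of p -> p from K and S.
p = ("var", "p")
pp = imp(p, p)
steps = [
    ("axiom", imp(imp(p, imp(pp, p)), imp(imp(p, pp), pp))),  # S
    ("axiom", imp(p, imp(pp, p))),                            # K
    ("mp", 0, 1),
    ("axiom", imp(p, pp)),                                    # K
    ("mp", 2, 3),
]
print(check_proof(steps)[-1])  # ('->', ('var', 'p'), ('var', 'p')), i.e. p -> p
```

The point is just that the trusted code is a few dozen lines; everything clever lives in the proofs being checked, not in the checker itself.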
I was not expecting this answer at all! Can you elaborate a little further? Even just a link to such a checker would be great.
http://en.wikipedia.org/wiki/Automated_proof_checking seems to have several such links.
But how do we know they are bug-free and exploit-free?
I’m under the impression that current software methodologies can at least empirically guarantee error rates like 1 per million lines of code [ETA: this is wrong, should be 1 per 10,000, see child]. Since an automated proof checker requires far fewer lines than this, we can probably guarantee no errors with something like probability 99.9%, which isn’t perfect, but then again a proof of friendliness is unlikely to give much better guarantees, since it might be wrong.
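To spell the arithmetic out (the 1,000-line figure for the checker's size is my own illustrative assumption, not something from the thread): if each line independently has probability p of containing an error, an n-line checker is error-free with probability (1 - p)^n.

```python
# Back-of-the-envelope reliability estimate for a small proof checker.
# Assumption (illustrative): the checker kernel is about 1,000 lines.
n = 1000
for p in (1e-6, 1e-4):  # 1 error per million lines vs. 1 per 10,000 lines
    print(f"p = {p}: P(no errors) = {(1 - p) ** n:.4f}")
# p = 1e-06: P(no errors) = 0.9990   <- the ~99.9% figure above
# p = 0.0001: P(no errors) = 0.9048  <- with the corrected 1-per-10,000 rate
```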
We can potentially add to this by not letting the AI know anything about the checker (this would actually be non-trivial, since the AI gets access to whatever information it wants about the outside world; I don’t actually know how to solve this problem).
Here’s some evidence for you to update on.
Thanks, that does bring my estimate down by about two orders of magnitude (to 1 error / 10,000 lines being the current standard for “very good” code). I think I misremembered the NASA figure as 1 per million instead of 1 per 500,000, and thought it was more typical than it actually was.
Still, 1 error per 10,000 lines seems definitely achievable today, and the rate can probably be pushed lower if we need it. Hopefully by the time we’re dealing with AGI we will have better methodologies.