It still seems extremely difficult to create a safeguard that has even a non-negligible chance of stopping a GAI that specifically set out to circumvent it.
Tangentially, I feel that our failure to make provably correct programs is due to two things:

1. A lack of the necessary infrastructure. Automatic theorem provers/verifiers are too primitive to be very helpful right now, and proving these things by hand is too tedious and error-prone. In a few years we will be able to make front-ends for theorem verifiers that will simplify things considerably (a sketch of what machine-checked correctness looks like today follows this list).

2. The fact that it would be inappropriate for many projects. Requirements change, and as soon as a program needs to do something different, different theorems need to be proved. Many projects gain almost no value from being proven correct, so there is no incentive to do so.
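To make (1) concrete, here is a minimal sketch of what machine-checked program correctness looks like today in a proof assistant such as Lean. The function `clampNonneg`, its name, and its specification are hypothetical illustrations of mine, not anything from existing tooling:

```lean
-- A toy "program": clamp an integer to be non-negative.
def clampNonneg (n : Int) : Int :=
  if n < 0 then 0 else n

-- Its specification, stated as a theorem and checked by the machine:
-- the output is >= 0 for every possible input.
theorem clampNonneg_nonneg (n : Int) : 0 ≤ clampNonneg n := by
  unfold clampNonneg
  split <;> omega  -- case-split on the `if`, then linear arithmetic

-- Illustrating (2): if the requirement changes (say, clamp into
-- [0, 10] instead), both the function and the theorem have to be
-- restated and re-proved.
```

Even for a three-line function, discharging the proof means knowing the right tactics, which is exactly the tedium point (1) is getting at.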