That’s nice, but I don’t currently believe there are any audits or protocols that can prove future AIs safe “beyond a reasonable doubt”.
I think you can do this with a capabilities test (e.g. ARC’s), just not with an alignment test (yet).
There’s a way to extend one into the other under certain restrictions: the system runs statelessly, each input must come from the latent space of the training set (or the system is shut down if its outputs are allowed to matter), and its plans are reviewed by other AIs.
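To make the restrictions concrete, here is a minimal sketch of what a single restricted episode might look like. This is purely illustrative: the functions `in_training_latent_space`, `model_act`, and `reviewer_approves` are hypothetical placeholders, not real APIs, and the similarity check is a crude stand-in for whatever "within the training latent space" would actually mean.

```python
import math

# Illustrative only: placeholder names, not an existing library or protocol.

def in_training_latent_space(x, training_embeddings, threshold=0.9):
    """Crude proxy: is the input embedding close to anything seen in training?"""
    def cosine(a, b):
        dot = sum(ai * bi for ai, bi in zip(a, b))
        na = math.sqrt(sum(ai * ai for ai in a))
        nb = math.sqrt(sum(bi * bi for bi in b))
        return dot / (na * nb) if na and nb else 0.0
    return any(cosine(x, e) >= threshold for e in training_embeddings)


def run_restricted_episode(x, training_embeddings, model_act, reviewer_approves):
    """One stateless episode: no memory carried over between calls.

    - If the input falls outside the training latent space and outputs
      would act on the world, shut down instead of proceeding.
    - Any proposed plan must be approved by a separate reviewer model
      before it is released.
    """
    if not in_training_latent_space(x, training_embeddings):
        return {"status": "shutdown", "reason": "input outside training latent space"}

    plan = model_act(x)  # model proposes a plan; no state is retained afterwards

    if not reviewer_approves(plan):
        return {"status": "rejected", "reason": "reviewer AI did not approve the plan"}

    return {"status": "approved", "plan": plan}
```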