Your expectation only holds if failures are perfectly correlated, or if adding layers actually weakens the individual layers; otherwise the probability of the AI beating both layers A and B is necessarily less than the probability of it beating just A or just B (P(A ∧ B) < P(A ∨ B)).
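Spelling that inequality out (a brief sketch of the standard probability bound; "beats X" here is shorthand, not notation used above, for the event that the AI defeats layer X):

$$P(\text{beats }A \wedge \text{beats }B) \;\le\; \min\bigl(P(\text{beats }A),\, P(\text{beats }B)\bigr) \;\le\; P(\text{beats }A \vee \text{beats }B)$$

The first inequality is an equality only when beating one layer guarantees beating the other, i.e. the perfectly-correlated case; in every other case the extra layer strictly lowers the chance of a full breakout.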
That’s true. However, I would expect a transhuman AI to be able to find a single point of failure which does not even occur to our limited minds, so this perfect correlation is a virtual certainty.
Now you’re just ascribing magical powers to a potentially-transhuman AI. I’m sure there exists such a silver bullet; in fact, by definition, if security isn’t 100%, that’s just another way of saying there exists a strategy which will work. But that ignores the point that layers of security are not completely redundant with proofs, utility functions, and decision theories, and that they add some amount of safety.
Disengaging.