Probably not: it’s hard to get a general intelligence to make consistently wrong decisions in any capacity, partly because, like you or me, it might realize that it has a design flaw and work around it.
A better plan is just to explicitly bake corrigibility guarantees (i.e. the stop button) into the design. Figuring out how to do that is the hard part, though.