I’m confused by your desire for an ‘automatic controlled shutdown’ and your fear that further meta-reasoning will override ethical inhibitions. In previous writings you’ve expressed a desire to have a provably correct solution before proceeding. But aren’t you consciously leaving in a race-condition here?
What’s to prohibit the meta-reasoning from taking place before the shutdown triggers? It would seem that either you can hard-code an ethical inhibition or you can’t. Along those lines, is it fair to presume that the inhibitions are always negative, so that non-action is the safe alternative? Why not just revert to a known state?
Eliezer ---
I’m confused by your desire for an ‘automatic controlled shutdown’ and your fear that further meta-reasoning will override ethical inhibitions. In previous writings you’ve expressed a desire to have a provably correct solution before proceeding. But aren’t you consciously leaving in a race-condition here?
What’s to prohibit the meta-reasoning from taking place before the shutdown triggers? It would seem that either you can hard-code an ethical inhibition or you can’t. Along those lines, is it fair to presume that the inhibitions are always negative, so that non-action is the safe alternative? Why not just revert to a known state?