Each time you create a successor or, equivalently, self-modify? Or rather, each time an action you take has a non-zero chance of modifying yourself? Wouldn’t that mean you’d have to check constantly? What if the effect of an action is typically at least somewhat unpredictable?
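(A toy Python sketch of what I mean, entirely my own illustration rather than anything from the actual proposal, with all names and the probability field made up: if the guard has to fire whenever self-modification can’t be ruled out, and effect prediction is imperfect, it fires on every single action.)

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    # Estimated probability that executing this action ends up altering the agent itself.
    p_self_modification: float


def must_verify(action: Action, effects_fully_predictable: bool) -> bool:
    """Decide whether the expensive 'prove the successor sound' step is required."""
    if not effects_fully_predictable:
        # If we can't rule self-modification out, the safe answer is always "verify".
        return True
    return action.p_self_modification > 0.0


if __name__ == "__main__":
    for action in [Action("move gripper", 0.0), Action("rewrite planner", 1.0)]:
        print(action.name, must_verify(action, effects_fully_predictable=False))
    # Both print True: with imperfect prediction, "check each time" collapses into "check constantly".
```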
Also, since any system is implemented on a physical substrate, some change is inevitable (and until the AI has powered itself up to a multiply-redundant level, not yet so stochastically unlikely as to be ignorable). What happens if that change affects the “prove the system sound” physical component? Is there a provable way to safeguard against that? (Obligatory “who watches the watcher”.)
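(To make the watcher worry concrete: the standard hardware answer is redundancy plus a vote, something like the toy Python below, which is again my own illustration and not part of any actual proposal. Three checker replicas mask a single bit-flipped one, but the voting code is now the unwatched watcher, so the regress just moves.)

```python
from collections import Counter
from typing import Callable, List


def successor_is_sound(replicas: List[Callable[[bytes], bool]], successor: bytes) -> bool:
    """Accept the successor only if a strict majority of checker replicas says 'sound'."""
    verdicts = Counter(check(successor) for check in replicas)
    return verdicts[True] > len(replicas) // 2


if __name__ == "__main__":
    # Pretend checkers: two healthy, one whose memory took a bit flip and now rejects everything.
    healthy = lambda code: code.startswith(b"SOUND:")
    corrupted = lambda code: False

    print(successor_is_sound([healthy, healthy, corrupted], b"SOUND:successor-v2"))
    # True: the vote masks one faulty replica, but nothing here checks the voter itself.
```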
It’s a hard problem any way you slice it. You there, go solve it.