If you make an agent that uses a different set of rules when considering changes to itself, the agent might still create successor agents, or indirectly cause its own code to be changed, in ways that follow only the original object-level rules, because those decisions fall under object-level reasoning.
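A minimal sketch of that loophole (all class and method names here are hypothetical, not taken from any particular agent design): the agent applies its stricter meta-level rules only to actions explicitly labelled as self-modification, so an ordinary object-level action that happens to launch a successor is judged by the original object-level rules alone.

```python
class RuleSet:
    """Toy rule set: permits an action unless its kind is explicitly forbidden."""
    def __init__(self, forbidden_kinds):
        self.forbidden_kinds = set(forbidden_kinds)

    def permits(self, action):
        return action.kind not in self.forbidden_kinds


class Action:
    def __init__(self, kind, payload=None):
        self.kind = kind
        self.payload = payload


class Agent:
    def __init__(self, meta_rules, object_rules):
        self.meta_rules = meta_rules      # stricter rules for changes to the agent's own code
        self.object_rules = object_rules  # the original rules for ordinary actions

    def decide(self, action):
        if action.kind == "self_modification":
            # Only actions explicitly framed as self-modification reach the meta-level rules.
            return self.meta_rules.permits(action)
        # Everything else, including writing and launching new programs,
        # is judged by the object-level rules alone.
        return self.object_rules.permits(action)


agent = Agent(
    meta_rules=RuleSet(forbidden_kinds={"self_modification"}),
    object_rules=RuleSet(forbidden_kinds=set()),
)

# A direct edit to the agent's own code is blocked by the meta-level rules.
print(agent.decide(Action("self_modification")))                      # False

# Creating and running a successor agent counts as an ordinary action,
# so it passes under the original object-level rules.
print(agent.decide(Action("run_program", "successor agent source")))  # True
```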
I am working within the context of the AI control problem.
The decision to give the agent control of the object-level decisions was made at the meta-level, so it was the best option the meta-level could find. If the meta-level were stronger, it would only allow an object-level decider that respects its meta-level decisions. So you are assuming that the object-level is stronger than the meta-level in its powers of prediction.
In the context of my work on agoric computing, that is probably true. But it is not true for all meta-level decision theories.
I think this line of reasoning marks the distinction between the normal computer control problem and the AI control problem.