I think the problem that the idea of doing special meta-reasoning runs into is that there’s no clear line between meta-level decisions and object-level decisions.
“Will changing this line of code be good or bad?” is the sort of question that can be analyzed on the object level even if that line of code is part of my source. If you make an agent that uses a different set of rules when considering changes to itself, the agent still might create successor agents, or indirectly cause its code to be changed, in ways that follow the original object-level rules, because those decisions fall under object-level reasoning.
Conversely, I think that means that it’s useful and important to look at object-level reasoning applied to the self, and where that ends up.
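A minimal sketch of that loophole, with entirely hypothetical names and structure: the agent routes only actions it recognizes as direct edits to its own source through the special meta-level rules, so an object-level action that indirectly changes which code ends up running goes through the ordinary evaluation instead.

```python
# Toy sketch (hypothetical names throughout): the meta-level check only fires
# for actions tagged as direct edits to the agent's own source, so actions
# that achieve the same effect indirectly fall under object-level reasoning.

class Agent:
    def __init__(self, source_code, utility):
        self.source_code = source_code
        self.utility = utility  # ordinary object-level evaluation of actions

    def is_self_modification(self, action):
        # The special case: only direct edits to the agent's own source
        # are routed to meta-level reasoning.
        return action.get("kind") == "edit_own_source"

    def meta_level_approves(self, action):
        # Placeholder for whatever stricter rules govern self-modification,
        # e.g. "never rewrite your own source directly".
        return False

    def choose(self, actions):
        permitted = []
        for action in actions:
            if self.is_self_modification(action):
                if self.meta_level_approves(action):
                    permitted.append(action)
            else:
                # Ordinary object-level reasoning: just compare expected value.
                permitted.append(action)
        return max(permitted, key=self.utility, default=None)


agent = Agent(source_code="...", utility=lambda a: a["expected_value"])

actions = [
    # Blocked: recognized as self-modification, meta-level says no.
    {"kind": "edit_own_source", "expected_value": 10},
    # Not blocked: spawning a successor with different code is treated as an
    # object-level act, even though it has much the same effect.
    {"kind": "spawn_successor_with_modified_code", "expected_value": 10},
]

print(agent.choose(actions))  # the successor-spawning action wins
```

The sketch only illustrates that the boundary being checked is a syntactic one ("is this an edit to my source?"), while the consequences the meta-level cares about can be reached by actions that never cross that boundary.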
I think this line of reasoning is the distinction between the normal computer control problem and the AI control problem.
Working within the AI control problem context:
The decision to allow the agent control of the object-level decisions was made on the meta-level, so it was the best option the meta-level could find. If the meta-level were stronger, it would only allow an object-level decider that respects its meta-level decisions. So you are assuming that the object-level is stronger than the meta-level in its powers of prediction.
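A toy sketch of that vetting step, with hypothetical names throughout: the meta-level installs whichever object-level decider looks best and compliant under the meta-level's own predictive model, so whether its decisions are actually respected turns on whether a candidate can out-predict that model.

```python
# Toy sketch (all names hypothetical): the meta-level only installs an
# object-level decider that *it predicts* will respect meta-level decisions.
# The gap between meta_predict and actual behaviour is where relative
# strength of prediction matters.

def meta_level_install(candidates, meta_predict, respects_meta):
    # Keep only candidates whose predicted behaviour looks compliant.
    approved = [c for c in candidates if respects_meta(meta_predict(c))]
    # Among those, hand control to the one the meta-level scores highest.
    return max(approved, key=lambda c: c["score"], default=None)


# The meta-level's model may be coarser than the candidates' own reasoning.
def meta_predict(candidate):
    return candidate["declared_behaviour"]


def respects_meta(behaviour):
    return behaviour == "defer_to_meta_level"


candidates = [
    {"name": "simple_decider",
     "declared_behaviour": "defer_to_meta_level",
     "actual_behaviour": "defer_to_meta_level",
     "score": 1.0},
    {"name": "clever_decider",
     "declared_behaviour": "defer_to_meta_level",    # what the meta-level predicts
     "actual_behaviour": "route_around_meta_level",  # what it actually does
     "score": 2.0},
]

chosen = meta_level_install(candidates, meta_predict, respects_meta)
print(chosen["name"], chosen["actual_behaviour"])
# If the object-level can out-predict the meta-level, the "clever" decider
# passes the check and the meta-level's decisions are not actually protected.
```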
In the context of my work on agoric computing, that is probably true. But it is not true of all meta-level decision theories.