How useful do you think goals in the base model are? If we’re trying to explain the behavior of observed systems, then it seems like the “basic things” are actually the goals in the explainer model. Goals in the base model might still be useful as a way to e.g. aggregate goals from multiple different explainer models, but we might also be able to do that job using objects inside the base model rather than goals that augment it.
This sort of thinking makes me want to put some more constraints on the relationship between base and explainer models.
One candidate is that the explainer model and the inferred goals should fit inside the base model. That is, there's some pair of perfect mappings: one from an arbitrary explainer model and its goals into a base model (which itself has no goals), and one back out that recovers the exact same explainer model plus goals. If everything is finite, this restriction actually has some teeth.
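To make the "fits inside" idea a bit more concrete, here's a minimal sketch of the round-trip condition in Python. Everything in it (ExplainerWithGoals, encode, decode, fits_inside) is invented notation for the constraint, not a claim about what the real objects look like:

```python
# A minimal sketch of the round-trip constraint. All names here are
# hypothetical stand-ins for the idea in the paragraph above.

from dataclasses import dataclass

@dataclass(frozen=True)
class ExplainerWithGoals:
    variables: frozenset  # the objects the explainer model talks about
    goals: frozenset      # the inferred goals, e.g. (variable, target) pairs

def fits_inside(explainer: ExplainerWithGoals, encode, decode) -> bool:
    """Round-trip condition: the explainer model + goals can be stored in the
    base model (encode) and re-extracted exactly (decode). The base model
    carries no goals of its own, so the goals have to be recoverable from
    its ordinary objects. With finite models this is directly checkable."""
    return decode(encode(explainer)) == explainer
```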
We might imagine having multiple explainer models + goals that fit the same physical system, e.g. explaining a thermostat as “wants to control the temperature of the room” versus “wants to control the temperature of the whole house.” For each explainer model we might imagine a different mapping that stores it inside the base model and then re-extracts it, so there’s one connection from the base model to the “wants to control the temperature of the room” explanation and another connection to the “wants to control the temperature of the whole house” explanation.
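As a toy illustration of the thermostat case (every concrete detail below is made up), here's one goal-free base state, two readings of it, and a separate encode/decode pair for each reading:

```python
# Toy thermostat illustration. The base model stores only physical state,
# no goals; each reading gets its own mapping into and out of that state.

base_state = {
    "setpoint": 20.0,
    "rooms": {"living_room": 21.0, "kitchen": 22.0, "bedroom": 19.0},
}

# Reading 1: "wants to control the temperature of the room".
room_reading = ("living_room_temp", 20.0)

def encode_room(_reading):
    # The goal is stored only as a pointer to an ordinary base-model object
    # (the setpoint register), not as a goal-shaped annotation.
    return {"variable_site": ("rooms", "living_room"), "goal_site": "setpoint"}

def decode_room(encoded):
    # Re-derive the goal's target value from the base model itself.
    return ("living_room_temp", base_state[encoded["goal_site"]])

# Reading 2: "wants to control the temperature of the whole house".
house_reading = ("house_mean_temp", 20.0)

def encode_house(_reading):
    return {"variable_site": ("rooms",), "goal_site": "setpoint"}

def decode_house(encoded):
    return ("house_mean_temp", base_state[encoded["goal_site"]])

# Two separate connections from the one base model to the two explanations,
# each satisfying the round-trip condition:
assert decode_room(encode_room(room_reading)) == room_reading
assert decode_house(encode_house(house_reading)) == house_reading
```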
We might also want to define some kind of metric for how similar explainer models and goals are, perhaps based on what they map to in the base model.
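One cheap candidate, purely as a sketch: compare the base-model "footprints" that two stored explainer models + goals occupy, say with a Jaccard distance over the sets of base-model sites they reference. The footprint format below just mirrors the toy encoding from the previous sketch:

```python
# Hypothetical similarity metric: compare which base-model sites two stored
# explainer models + goals occupy.

def base_footprint(encoded) -> frozenset:
    """Set of base-model sites that a stored explainer model + goals touches."""
    sites = set()
    for value in encoded.values():
        sites.add(value if isinstance(value, tuple) else (value,))
    return frozenset(sites)

def explainer_distance(encoded_a, encoded_b) -> float:
    """Jaccard distance between base-model footprints:
    0.0 if they map to exactly the same objects, 1.0 if fully disjoint."""
    a, b = base_footprint(encoded_a), base_footprint(encoded_b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

# The "room" vs "whole house" readings of the thermostat, by this measure:
d = explainer_distance(
    {"variable_site": ("rooms", "living_room"), "goal_site": "setpoint"},
    {"variable_site": ("rooms",), "goal_site": "setpoint"},
)
# The footprints share only the setpoint site out of three, so d == 1 - 1/3.
```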
And hopefully this metric doesn’t get too messed up if the base model undergoes ontological shift—but I think guarantees may be thin on the ground!
Anyhow, still interested, illegitimi non carborundum and all that.