(I stumbled across this old thread; let me know if you’ve learned anything since then.)
(I do not have professional knowledge of control engineering.)
You cited the claim “a controller benefits from a model of the item it controls”, and then you wrote “The third reference…contains no claim that a model is an essential part of a control system”. Those are different, right?
For my part, I don’t think it’s essential. I do think it’s helpful. Incidentally, you can find other places where Graziano has made the stronger claim that it’s essential, and when he makes that claim, I think he’s wrong.
Why do I think it’s helpful? Well, if you have a generative model, you can do model-predictive control (MPC).
Then maybe you’ll respond: Fine, if I don’t have a generative model, I’ll just do something else instead, it’s not like MPC is the only game in town.
But a nice thing about MPC is that you can update the generative model by self-supervised (predictive) learning, which you can’t do with a policy. That is, whenever you make a wrong prediction, you get high-dimensional data about what you should have predicted instead, and thus about how to improve the model. (You get a full error gradient “for free” with each query.) You can also use off-policy observations to update the model. And you can have a ridiculously complicated, open-ended space of possible generative models and still converge on a good one fast, because of the rich data from self-supervised learning.
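To gesture at why, here’s a minimal sketch of self-supervised model updating, using a toy linear dynamics model (all names and numbers are mine, purely for illustration): each observed transition supplies its own label, so every step yields a full gradient, and logged off-policy transitions are just as usable as on-policy ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_u = 4, 2                       # state and control dimensions (toy sizes)
A_hat = np.zeros((n_x, n_x))          # learned dynamics parameters
B_hat = np.zeros((n_x, n_u))
lr = 0.01

def train_step(x, u, x_next):
    """One gradient step on squared prediction error for x' ≈ A_hat x + B_hat u."""
    global A_hat, B_hat
    err = A_hat @ x + B_hat @ u - x_next   # the "label" x_next arrives for free
    A_hat -= lr * np.outer(err, x)         # gradient of 0.5*||err||² w.r.t. A_hat
    B_hat -= lr * np.outer(err, u)         # ... and w.r.t. B_hat

# Any logged transitions will do; here they come from a hidden "true" system,
# driven by arbitrary (i.e. off-policy) actions.
A_true = 0.3 * rng.normal(size=(n_x, n_x))
B_true = 0.3 * rng.normal(size=(n_x, n_u))
for _ in range(5000):
    x, u = rng.normal(size=n_x), rng.normal(size=n_u)
    train_step(x, u, A_true @ x + B_true @ u)

print(np.allclose(A_hat, A_true, atol=1e-2))  # should print True: model recovered
```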
Those kinds of considerations make me think that putting a generative model inside the controller is at least plausibly helpful.
> You cited the claim “a controller benefits from a model of the item it controls”, and then you wrote “The third reference…contains no claim that a model is an essential part of a control system”. Those are different, right?
The cited claim was additionally that benefitting from a model is “a fundamental principle of control engineering”, and Graziano places no limitation on what sort of controller he was referring to. I don’t think there’s a substantial gap between that and my “essential”.
Model the physics of how a room warms and cools, and that may tell you where the best place to site the thermostat is, and how powerful a heat source you need, but I do not know any way in which the thermostat might control better by itself containing any model.
In another comment, I remarked on a problem with learning a model of the plant being controlled: because the controller is controlling the plant, the full dynamics of the plant alone cannot be observed. I don’t know the current state of theory and practice on this issue.
So, yes, you can design a controller to contain a model, but my claim is that it is not so fundamental an idea.
Graziano (as far as appears from the extract, and I do not feel motivated to consult the full paper) uses it to justify the idea that we model our own minds and other people’s, something which seems clear from our own experience without depending on control theory.
I’m definitely not defending Graziano, as mentioned.
> I do not know any way in which the thermostat might control better by itself containing any model
Let’s say our thermostat had a giant supercomputer cluster and a 1-frame-per-minute camera inside it.
We use self-supervised learning to learn a mapping:
(temperature history (including now), heater setting history (including now), video history (including now)) ↦ (next temperature, next video frame).
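In code, the interface would look something like this (the name and type annotations are placeholders of mine):

```python
def generative_model(
    temp_history: list[float],    # temperatures up to and including now
    heater_history: list[float],  # heater settings up to and including now
    video_history: list[bytes],   # camera frames up to and including now
) -> tuple[float, bytes]:         # (next temperature, next video frame),
    ...                           # or a distribution over them, for
                                  # probabilistic rollouts
```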
This mapping is our generative model, right? And it could grow very sophisticated. Like it could learn that when cocktail glasses appear in the camera frame, then a party is going to start soon, and the people are going to heat up the room in the near future, so we should keep the room on the cooler side right now to compensate.
Then the thermostat can do MPC, i.e. run through lots of probabilistic rollouts of the next hour under different possible heater settings, and find the rollout where the temperature stays steadiest, plus some exploration randomness (more on that below).
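Here’s a cartoon of that planning loop, strictly a sketch: `predict_next` is a made-up stand-in for the learned model (a leaky room plus a heater, not real physics), and random shooting is just one of many ways to search over plans.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_next(temp, heater, ambient=15.0):
    """Stand-in for the learned model: a leaky room warmed by the heater.
    (The real model would consume the full histories, video included.)"""
    return temp + 0.1 * (ambient - temp) + 0.5 * heater

def mpc_choose_heater(temp_now, setpoint, horizon=60, n_rollouts=500):
    """Random-shooting MPC: sample heater-setting sequences, roll each out
    through the model, keep the first action of the lowest-cost rollout."""
    best_cost, best_first = np.inf, 0.0
    for _ in range(n_rollouts):
        plan = rng.uniform(0.0, 1.0, size=horizon)  # candidate settings
        temp, cost = temp_now, 0.0
        for u in plan:
            temp = predict_next(temp, u)
            cost += (temp - setpoint) ** 2          # squared deviation
        if cost < best_cost:
            best_cost, best_first = cost, plan[0]
    return best_first  # apply it, then re-plan at the next time step

print(mpc_choose_heater(temp_now=18.0, setpoint=21.0))
```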
> because the controller is controlling the plant, the full dynamics of the plant alone cannot be observed
That’s just explore-versus-exploit, right? You definitely don’t want to always exactly follow the trajectory that is predicted to be optimal. You want to occasionally do other things (a.k.a. explore) to make sure your model is actually correct. I guess some kind of multi-armed bandit algorithm thing?
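Something like epsilon-greedy would be the crudest version, reusing `mpc_choose_heater` from the sketch above (the 5% exploration rate is an arbitrary assumption of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
EPSILON = 0.05  # fraction of steps spent exploring (arbitrary choice)

def choose_heater(temp_now, setpoint):
    """Mostly follow the model's predicted-optimal action; occasionally
    probe off-plan, so the model keeps seeing dynamics that the usual
    policy would never expose."""
    if rng.random() < EPSILON:
        return rng.uniform(0.0, 1.0)              # explore
    return mpc_choose_heater(temp_now, setpoint)  # exploit
```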
> Let’s say our thermostat had a giant supercomputer cluster and a 1-frame-per-minute camera inside it.
This sounds like a product of Sirius Cybernetics Corporation. “It is very easy to be blinded to the essential uselessness of them by the sense of achievement you get from getting them to work at all.”
All you need is a bimetallic strip and a pair of contacts to sense “too high” and “too low”.
> Like it could learn that when cocktail glasses appear in the camera frame, then a party is going to start soon, and the people are going to heat up the room in the near future, so we should keep the room on the cooler side right now to compensate.
In other words, control worse now, in order to...what?
> In other words, control worse now, in order to...what?
Suppose the loss is mean-square deviation from the set point. Suppose there’s going to be a giant uncontrollable exogenous heat source soon (crowded party), and suppose there is no cooling system (the thermostat is hooked up to a heater but there is no AC).
Then we’re expecting a huge contribution to the loss function from an upcoming positive temperature deviation. And there’s not much the system can do about it once the party is underway, other than (obviously) not turning on the heat and making it even worse.
But supposing the system knows this is going to happen, it can keep the room a bit too cool before the party starts. That also incurs a loss, of course. But the way mean-square-loss works is that we come out ahead on average.
Like, if the deviation is 0° now and then +10° midway through the party, that’s higher-loss than −2° now and +8° midway through the party, again assuming loss = mean-square-deviation. 0²+10² > 2²+8², right?
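A two-line check, in case the arithmetic is suspect:

```python
mse = lambda devs: sum(d * d for d in devs) / len(devs)
print(mse([0, 10]), mse([-2, 8]))  # 50.0 vs 34.0: pre-cooling comes out ahead
```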
> This sounds like a product of Sirius Cybernetics Corporation. “It is very easy to be blinded to the essential uselessness of them by the sense of achievement you get from getting them to work at all.”
> All you need is a bimetallic strip and a pair of contacts to sense “too high” and “too low”.
Well jeez, I’m not proposing that we actually do this! I thought the “giant supercomputer cluster” was a dead giveaway.
If you want a realistic example, I do think the brain uses generative modeling / MPC as part of its homeostatic / allostatic control systems (and motor control and so on). I think there are good reasons that the brain does it that way, and that alternative model-free designs would not work as well (although they would work more than zero).