All along, I suspect, people were using the “mutual information” criterion to determine whether something “has a model” of something else.
This is flatly at variance with the uses of “model” I listed, drawn from OB/LW, and the way the word is defined in every book on model-based control.
No, you just asserted that people were using “model” in your sense in some posts you cited; there was nothing clear in any of the examples that implied they meant it in your sense rather than mine. And you didn’t quote from any book on model-based control, and even if you did, you would still need to show how it’s not equivalent to merely having mutual information.
The only time people try to redefine “X is a model of Y” to mean “X has mutual information with Y” is when someone points out that systems of the sort that I described do not contain models.
No, as others pointed out, they normally use “model” to mean e.g.
“a simplified, abstracted representation of an object or system that presents only the information needed by its user. For example, the plastic models of aircraft I built as a kid abstract away everything except the external appearance, a mathematical model of a system shows only those dimensions and relationships useful to the model’s users,”
or
“temperature, as used by a thermostat, is a model of a system: It abstracts away all the details about the energy of individual particles in the system, except for a single scalar value representing the average of all those energies.”
So it’s clear they would count a single value that attempts to capture all critical properties of another system as a “model” of that system.
“X has mutual information with Y” is not a technical explanation of an informal concept labelled “model”. It is a completely different concept. The concept of a model, as I and everyone else outside these threads uses it, is very clear, unambiguous, and far narrower than mere mutual information.
I explained why this is false: it does not account for all the systems that are clearly labeled as “models” (aircraft finite element models, plastic toy models, etc.) yet have only mutual information with some phenomenon, and to which the user must apply some transformation in order to make a prediction.
Vladimir Nesov objected to the word “correspondence” as vague; but if you want a technical elaboration of that, look in the direction of “isomorphism”, not “mutual information”.
But (as I explained before), isomorphism is not what you want here. Everyone accepts that models don’t have to be perfect representations. In contrast, “isomorphism” means a one-to-one mapping, which would indeed be a perfect model. “Mutual information” is more general than that: it includes isomorphisms, but also cases where the best mapping isn’t always correct, and where the model doesn’t include all aspects of the phenomenon.
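To make the contrast concrete, here is a minimal sketch that estimates mutual information from samples. Everything in it is an assumption for illustration: the four-state variable, the three mappings, and the helper name `mutual_information` are hypothetical, and the estimate is just the plug-in formula over empirical frequencies.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        ratio = p_xy * n * n / (px[x] * py[y])
        mi += p_xy * math.log2(ratio)
    return mi


# Hypothetical phenomenon X: four states, each observed 25 times.
xs = [0, 1, 2, 3] * 25

# One-to-one relabelling of X: an isomorphism.
iso = [(x, (x + 1) % 4) for x in xs]

# Lossy summary: keeps only the parity of X, dropping other aspects.
lossy = [(x, x % 2) for x in xs]

# Mostly faithful copy that is wrong on every tenth sample.
noisy = [(x, (x + 1) % 4 if i % 10 == 0 else x) for i, x in enumerate(xs)]

mi_iso = mutual_information(iso)      # 2 bits: all of X is recoverable
mi_lossy = mutual_information(lossy)  # 1 bit: only part of X is captured
mi_noisy = mutual_information(noisy)  # strictly between 0 and 2 bits
```

In this toy setup the isomorphism is the special case that reaches the maximum, while the lossy and noisy mappings still carry nonzero mutual information without being one-to-one.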
And I don’t think this is just an issue of arguing definitions. There’s a broader issue about whether you can helpfully carve conceptspace in a way that captures Richard’s definition of “model” but excludes things that “merely” have mutual information.
Well, you have my answer to that. Conceptspace is carved along one line called “model”, and along another line called “mutual information”.
Er, that’s not how carving conceptspace works. The task of helpfully carving conceptspace is to show how your cuts don’t split things with significant relevant similarities. I claim your cut does split such things when you say a model “must make predictions”. This would count a computer model of an aircraft as “not a model”.
You’re missing the point of the problem when you say what you did here.
Both lines matter, both have their uses, and they are in very different places. You want to erase the former or move it to coincide with the latter, but I have seen no argument for doing this.
No, what I’m saying is that to be a model, something must have (nontrivial) mutual information with some other phenomenon. But “model” is most often used to connote a case where some human, with whom you can debate, will apply the necessary interpretation to the physical instantiation of the model so as to tell you what its prediction is.
Still, something “has a model” whether or not some human is actually applying the necessary interpretation. The domino computer I linked contains a model of binary addition, even before someone realizes it. A computer’s hardware can have a model of an aircraft, even if someone throws it in the trash. In fact, the whole field of computation is basically identifying which physical systems already contain models of some kind of computation, and which we can therefore rely on, given some interpretation, to consistently give us the correct answer.
I do not find it helpful to say, “this thing over here explicitly outputs a prediction, so it’s a model, but this thing over here is just entangled with the phenomenon, so it doesn’t have a model”. Both are models, and the problem is on our end in the inability to harness the correlation to make what we consider a prediction.
When I’ve looked for information-theoretic or Bayesian analyses of control, I have found nothing substantial. Of course, I’m aware of the use of Bayesian techniques within control theory, such as Kalman filters. This is asking for the reverse inclusion. That is the substantial issue here.
Sorry, I don’t see it. The only problem is your arbitrary distinction between model-based and non-model-based controllers, when really, both are model-based. As I said when I rephrased your claim, the substantive issue is how much of a given system needs to be modeled, and I already accept your claim that a model needn’t include everything about its environment, and that further, people typically overestimate how much must be modeled.
That is what we are really talking about, and I already agree with you there. All that remains is your arbitrary re-assignment of some things as “models” and others not, which is fruitless.
No, you just asserted that people were using “model” in your sense in some posts you cited; there was nothing clear in any of the examples that implied they meant it in your sense rather than mine. And you didn’t quote from any book on model-based control, and even if you did, you would still need to show how it’s not equivalent to merely having mutual information.
With respect to the links I provided to earlier postings on OB/LW, I shall only say that I have reviewed them and stand by the characterisation I made of them at the time (which went beyond mere assertion that they agree with me). To amplify my claim regarding books on model-based control theory, the following notes are drawn from the books I have to hand which include an easily identified statement of what the authors mean by a model. All of them are talking about a system that is specifically similar in structure to, and not merely entangled with, the thing modelled. At this point I think it is up to you to show that these things are equivalent. As I said at the end of my last comment, this would be a highly non-trivial task, a complete reconstruction of the content of books such as these. (It is too large to do in the columns of Less Wrong, but I look forward to reading it, whoever writes it.)
1. Brosilow & Joseph, “Techniques of Model-Based Control”
Page 10, Figure 1.6, “Generic form of the model-based control strategy.” This is a block diagram in which one block is labelled “Process”, and another “Model”; the Model is a subsystem of the control system, designed to have the same input-output behaviour as the Process which the control system is to control. Ding!
2. Marlin, “Process Control”
Page 584, section 19.2, “The Model Predictive Control Structure”.
Here the author introduces the eponymous control method, in which a model of the process to be controlled is constructed and used to predict its future behaviour, in order to overcome the problem that (in the motivating example) the process contains substantial transport lags (a common situation in process control). The model is, as in the previous reference, a mathematical scheme designed to have the same input-output relation as the real process, and is used by the controller to predict the future values of some of the variables. Ding!
3. Goodwin, Graebe, and Salgado, “Control System Design”.
Pages 29-30, section 2.5: (paraphrased slightly) “Let us also assume that the output is related to the input by a known functional relationship of the form y = f(u)+d, where f is a transformation that describes the input-output relations in the plant. We call a relationship of this type a model.” Ding!
4. Astrom and Wittenmark, “Adaptive Control”
Page 20, Chapter 1, “Model-Reference Adaptive Systems”
Another block diagram as in Brosilow & Joseph. Ding!
5. Leigh, “Control Theory” (2nd ed.)
Chapter 6, “Mathematical modelling”.
Sorry, no nuggets to quote; you’ll have to read it yourself. But it’s a whole chapter about models in the above sense. This, in fact, is a book I’d recommend as an introduction to control theory in general, which is why I mention it, despite it not lending itself to concise quotation. Ding!
Ding! Ding! Ding! Ding! Ding!
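The definitions collected in the list above share one shape: the controller carries a relationship of the form y = f(u) + d that is meant to reproduce the input-output behaviour of the process. A minimal sketch of that idea follows, assuming a static linear plant. The gains, the disturbance value, and the names `process`, `model`, and `run` are hypothetical and are not taken from any of the books cited.

```python
def process(u, d=0.5):
    """The real plant: y = f(u) + d, with f(u) = 2.0 * u (hypothetical values)."""
    return 2.0 * u + d


def model(u, gain=1.8):
    """The controller's internal model of f; its gain is deliberately a bit wrong."""
    return gain * u


def run(setpoint, steps=20, model_gain=1.8):
    """Run the model alongside the process and return the final output and disturbance estimate."""
    d_hat = 0.0  # current estimate of the disturbance d
    y = None
    for _ in range(steps):
        # Invert the model to choose the input expected to hit the setpoint.
        u = (setpoint - d_hat) / model_gain
        # Apply that input to the real process and observe the output.
        y = process(u)
        # Whatever the model failed to predict is attributed to the disturbance.
        d_hat = y - model(u, model_gain)
    return y, d_hat


y_final, d_est = run(setpoint=3.0)
```

In this sketch the output settles on the setpoint even though the model gain is off, because `d_hat` ends up absorbing both the real disturbance and the model mismatch. The point of the illustration is only that the model here is a structural stand-in for the process, something the controller can feed an input to and read a predicted output from.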