Don’t you think that the control-theoretic toolbox becomes a very incomplete perspective for analysing and predicting the behaviour of systems exactly when we move from complicated, but manageable cybernetic systems (which are designed and engineered) to messy human brains, deep RL, or Transformers, which are grown and trained?
I think you misunderstood what I was saying. Suppose that some part of the brain actually literally involves feedback control. For example (more details here):
Fat cells emit leptin. The more fat cells there are, the more leptin winds up in your bloodstream.
There are neurons in the hypothalamus called “NPY/AgRP neurons”, which have access to the bloodstream via a break in the blood-brain barrier. When leptin binds to these neurons, it makes them express less mRNA for NPY & AgRP, and also makes them fire less.
The less that NPY/AgRP neurons fire, and the less AgRP and NPY that they release, the less that an animal will feel inclined to eat, via various pathways in the brain that we don’t have to get into.
The less an animal eats, the less fat cells it will have in the future.
This is straightforwardly a feedback control system that (under normal circumstances) helps maintain an animal’s adiposity. Right?
All I’m saying here is, if we look in the brain and we see something like this, something which is actually literally mechanistically a feedback control system, then that’s what we should call it, and that’s how we should think about it.
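For concreteness, here is a minimal toy simulation of the leptin loop just described. Only the qualitative negative-feedback structure is taken from the bullets above; all constants, the linear response functions, and the adiposity update rule are made up for illustration:

```python
# Toy simulation of the leptin negative-feedback loop described above.
# Made-up constants and functional forms; only the qualitative structure
# (more fat -> more leptin -> less NPY/AgRP firing -> less eating -> less fat)
# comes from the description.

def simulate_adiposity(fat_cells, steps=200):
    for _ in range(steps):
        leptin = 0.01 * fat_cells                  # more fat cells -> more circulating leptin
        npy_agrp_firing = max(0.0, 1.0 - leptin)   # leptin suppresses NPY/AgRP neuron firing
        eating = 2.0 * npy_agrp_firing             # less NPY/AgRP activity -> less eating
        fat_cells += 0.1 * (50.0 * eating - fat_cells)  # adiposity drifts toward a level set by intake
    return fat_cells

print(round(simulate_adiposity(30.0), 1))   # starts low, climbs back toward ~50
print(round(simulate_adiposity(300.0), 1))  # starts high, falls back toward ~50
```

Perturb the adiposity in either direction and it relaxes back to the same level, which is exactly the behaviour of a negative-feedback controller.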
I am definitely not endorsing “control theory” as a grand unified theory of the brain. I am generally opposed to grand unified theories of the brain in the first place. I think if part of the brain is configured to run an RL algorithm, we should say it’s doing RL, and if a different part of the brain is configured as a feedback control system, we should say it’s a feedback control system, and yes, if the thalamocortical system is configured to do the specific set of calculations that comprise approximate Bayesian inference, then we should say that too!
By the same token, I’m generally opposed to grand unified theories of the body. The shoulder involves a ball-and-socket joint, and the kidney filters blood. OK cool, those are two important facts about the body. I’m happy to know them! I don’t feel the need for a grand unified theory of the body that includes both ball-and-socket joints and blood filtration as two pieces of a single grand narrative. Likewise, I don’t feel the need for a grand unified theory of the brain that includes both the cerebellum and the superior colliculus as two pieces of a single grand narrative. The cerebellum and the superior colliculus are doing two very different sets of calculations, in two very different ways, for two very different reasons! They’re not quite as different as the shoulder joint versus the kidney, but same idea.
By the same token, I’m generally opposed to grand unified theories of the body. The shoulder involves a ball-and-socket joint, and the kidney filters blood. OK cool, those are two important facts about the body. I’m happy to know them! I don’t feel the need for a grand unified theory of the body that includes both ball-and-socket joints and blood filtration as two pieces of a single grand narrative.
I think I am generally on board with you on your critiques of FEP, but I disagree with this framing against grand unified theories. The shoulder and the kidney are both made of cells. They both contain DNA which is translated into proteins. They are both designed by an evolutionary process.
Grand unified theories exist, and they are precious. I want to eke out every sliver of generality wherever I can. Grand unified theories are also extremely rare, and far more common in the public discourse are fakes that create an illusion of generality without making any substantial connections. The style of thinking that looks at a ball-and-socket joint and a blood filtration system and immediately thinks “I need to find how these are really the same”, rather than studying these two things in detail and separately, is apt to create these false grand unifications. And although I haven’t looked into FEP as deeply as you or other commenters, the writing I have seen on it smells more like this mistake than true generality.
But a big reason I care about exposing these false theories and the bad mental habits that are conducive to them is precisely because I care so much about true grand unified theories. I want grand unified theories to shine like beacons so we can notice their slightest nudge, and feel the faint glimmer of a new one when it is approaching from the distance, rather than be hidden by a cacophony of overblown rhetoric coming from random directions.
Methodologically, opposing “grand unifying theories” of the brain (or of an AI architecture) doesn’t make sense to me. Of course we want grand unifying theories of these things: we don’t deal with the generator and the discriminator separately, we deal with the GAN as a whole (or with LeCun’s H-JEPA architecture as a whole, to take a more realistic example of an agent; it has many distinct DNN components, performing inference and being trained separately and at different timescales), because ultimately we are going to deal with the system as a whole, not its components. We want to predict the behaviour of the system as a whole.
If you analyse and predict the components separately in order to predict the system as a whole, you are effectively “sort of” executing a “grand unifying theory” in your own head. That is much less reliable than if the theory is formalised, tested, simulated, etc. explicitly.
Now, the reason why I hold both Active Inference and control theory as valid perspectives is that neither of them is a “real” unifying model of a messy system, for various reasons. So, in effect, I suggest that taking both perspectives is useful: if their predictions diverge in some situation, we make a judgement call about which prediction we “trust” more, which amounts to executing our “grand unifying theory” (there are also other approaches for triaging such a situation, for example, applying the precautionary principle and trusting the more pessimistic prediction).
So, if we could construct a unifying theory that would easily recover both control theory (I mean not the standard KL control, which Active Inference recovers, but more SoTA developments of control theory, such as those discussed by Li et al. (2021)) and Active Inference, that would be great. Or it might be shown that one recovers the other (I doubt this is possible, but who knows).
Finally, there is the question of whether unifying the algorithms is “sound” or too “artificial, forced”. Your claim is that Active Inference is a “forced” unification of perception and planning/action algorithms. My first reply to this is that even if you dislike the Active Inference unification, you had better provide an alternative unification, because otherwise you just do the unification in your head, which is not optimal. And second, in the case of perception and action specifically, they should be considered together, as Anil Seth explains in “Being You” more eloquently than I can (I cited it in another comment).
I’m pretty confused here.

The thing I’m advocating for is: if a system has seven different interconnected components, we should think of it as a system with seven different interconnected components, and we can have nice conversations about what those components are, how they work together and interact, and what the system is going to do at the end of the day in different situations.
I believe that the brain has a number of different interconnected components, including parts that implement feedback control (cf. the leptin example above) and parts that run an RL algorithm and maybe parts that run a probabilistic inference algorithm and so on. Is that belief of mine compatible with the brain being “a unified system”? Beats me, I don’t know what that means. You give examples like how a GAN has two DNN components but is also “a unified system”, and LeCun’s H-JEPA has “many distinct DNN components” but is also “a unified system”, if I understand you correctly. OK, then my proposed multi-component brain can also be “a unified system”, right? If not, what’s the difference?
Now, if the brain in fact has a number of different interconnected components doing different things, then we can still model it as a simpler system with fewer components, if we want to. The model will presumably capture some aspects / situations well, and others poorly—as all models do.
If that’s how we’re thinking about Active Inference—an imperfect model of a more complicated system—then I would expect to see people say things like the following:
“Active inference is an illuminating model for organisms A,B,C but a misleading model for organisms D,E,F” [and here’s how I know that…]
“Active inference is an illuminating model for situations / phenomena A,B,C but a misleading model for situations / phenomena D,E,F”
“Active inference is an illuminating model for brain regions A,B,C but a misleading model for brain regions D,E,F”
“…And that means that we can build a more accurate model of the brain by treating it as an Active Inference subsystem (i.e. for regions A,B,C) attached to a second non-Active-Inference subsystem (i.e. for regions D,E,F). The latter performs functions X,Y,Z, and the two subsystems are connected as follows….”
My personal experience of the Active Inference literature is very very different from that! People seem to treat Active Inference as an infallible and self-evident and omni-applicable model. If you don’t think of it that way, then I’m happy to hear that, but I’m afraid that it doesn’t come across in your blog posts, and I vote for you to put in more explicit statements along the lines of those bullet points above.
I believe that the brain has a number of different interconnected components, including parts that implement feedback control (cf. the leptin example above) and parts that run an RL algorithm and maybe parts that run a probabilistic inference algorithm and so on. Is that belief of mine compatible with the brain being “a unified system”?
“Being” a unified system has realist connotations. Let’s stay on the instrumentalist side, at least for now. I want to analyse the brain (or, really, the whole human) as a unified system, and assume an intentional stance towards it, and build a theory of mind for it, all because I want to predict its behaviour as a whole.
OK, then my proposed multi-component brain can also be “a unified system”, right? If not, what’s the difference?
Yes, it can and should be analysed as a unified system, to the extent possible and reasonable (not falling into over-extending theories by concocting components in an artificial, forced way).
“Active inference is an illuminating model for organisms A,B,C but a misleading model for organisms D,E,F”
Active Inference is simultaneously an illuminating and a misleading model of Homo sapiens, in much the same way that Newtonian physics is both an illuminating and a misleading theory of planetary motion. I explained why it’s misleading: it doesn’t account for intrinsic contextuality in inference and decision-making (see my other top-level comment on this). The austere instrumentalist response to this would be “OK, let’s, as scientists, model the same organism as two separate Active Inference models at the same time, and decide ‘which one acts at the moment’ based on the context”. While this approach could also be illuminating and instrumentally useful, I agree it begins to get very slippery.
Also, Active Inference is too simple/general (I suspect, but am not sure) to recover some of the more sophisticated distributed control characteristics of the human brain (or, possibly, of AI systems, for that matter), as I discussed above.
Active Inference is not a “very misleading” model (“degree of misleadingness” is not a precise thing, but I hope you get it) of any “normal” agent we can think of (well, barring some extreme instances of quantumness/contextuality, à la agentic quantum AI, but let’s not go there), precisely because it’s very generic, as I discuss in this comment. By the same token, maximum entropy RL would be a misleading model of an agent sitting on the boundary between the scales where random fluctuations do and don’t make a difference (this would have to be a scale smaller than the cellular one, because random (quantum) fluctuations probably already don’t make a difference at the cellular scale; and because the scale is so small, it’s also hard to construct any “smart” agent there).
“Active inference is an illuminating model for situations / phenomena A,B,C but a misleading model for situations / phenomena D,E,F”
See above, pretty much. Active Inference is not a good model for decision making under cognitive dissonance, i.e., intrinsic contextuality (Fields typically cites Dzhafarov for psychological evidence of this).
Note that intrinsic contextuality is not a problem for every theory of agency; it’s more endemic to Active Inference than to (some) other theories. E.g., connectionist models, where NNs are first-class citizens of the theory, are immune to this problem. See Marciano et al. (2022) for some work in this direction.
Also, Active Inference is probably not a very illuminating model for discussing access consciousness and memory (some authors, e.g. Whyte & Smith (2020), claim that it is at least partially illuminating, but it is definitely not the whole story). (Phenomenal consciousness is more debatable, and I tend to think that Active Inference is a rather illuminating model for that phenomenon.)
“Active inference is an illuminating model for brain regions A,B,C but a misleading model for brain regions D,E,F”
I know too little about brain regions, neurobiological mechanisms, and the like, but I guess it’s possible (maybe not today, or maybe already today; I really don’t know enough to judge, and you can tell better; also, these may already have been found, but I don’t know the literature) to find the mechanisms and brain regions that play a role during “context switches”, or in ensuring long-range distributed control, as discussed above. Or in access consciousness and memory.
“…And that means that we can build a more accurate model of the brain by treating it as an Active Inference subsystem (i.e. for regions A,B,C) attached to a second non-Active-Inference subsystem (i.e. for regions D,E,F). The latter performs functions X,Y,Z, and the two subsystems are connected as follows….”
I hope it is clear from the exposition above that you couldn’t quite factor Active Inference out into a subsystem of the brain/mind (except under a “multiple Active Inference models with context switches” model of the mind, which, as I noted above, I think would be a rather iffy model to begin with). I would rather say: Active Inference still serves as a “framework” model, with certain “extra-Act-Inf” pieces (such as access consciousness and memory) “attached” to it, plus other models (distributed control, and maybe some others I haven’t thought about deeply) that don’t quite cohere with Active Inference, so that we can only resort to modelling the brain/mind as either one or the other, getting predictions, and comparing them.
My personal experience of the Active Inference literature is very very different from that! People seem to treat Active Inference as an infallible and self-evident and omni-applicable model. If you don’t think of it that way, then I’m happy to hear that, but I’m afraid that it doesn’t come across in your blog posts, and I vote for you to put in more explicit statements along the lines of those bullet points above.
OK, I did that. My own understanding (and appreciation of the limitations) of Active Inference has progressed significantly, and changed considerably over the last several weeks alone, so most of my earlier blog posts could be significantly wrong or misleading in important respects (and, I think, even my current writing is significantly wrong and misleading, because this process of re-appreciating Active Inference is not yet settled for me).
I would also encourage you to read about thermodynamic machine learning and MCR^2 theories. (I want to do this myself, but still haven’t.)
It’s cool that you’re treating Active Inference as a specific model that might or might not apply to particular situations, organisms, brain regions, etc. In fact, that arguably puts you outside the group of people / papers that this blog post is even criticizing in the first place—see Section 0.
A thing that puzzles me, though, is your negative reactions to Sections 3 & 4. From this thread, it seems to me that your reaction to Section 3 should have been:
“If you have an actual mechanical thermostat connected to an actual heater, and that’s literally the whole system, then obviously this is a feedback control system. So anyone who uses Active Inference language to talk about this system, like by saying that it’s ‘predicting’ that the room temperature will stay constant, is off their rocker! And… EITHER …that position is a straw-man, nobody actually says things like that! OR …people do say that, and I join you in criticizing them!”
And similarly for Section 4, for a system that is actually, mechanistically, straightforwardly based on an RL algorithm.
But that wasn’t your reaction, right? Why not? Was it just because you misunderstood my post? Or what’s going on?
I thought your post was an explanation of why you don’t find Active Inference a useful theory/model, rather than a criticism of people. I mean, it sort of criticises the authors of the FEP papers for various reasons, but who cares? I care whether the model is useful or not, not whether the people who proposed the theory were clear in their earlier writing (as long as you are able to arrive at an actual understanding of the theory). I didn’t see this as a central argument.
So, my original reaction to 3 (the root comment in this thread) was about the usefulness of the theory (vs control theory), not about people.
Re: 4, I already replied that I misunderstood your “mechanistic lizard” assumption. So only the first part of my original reply to 4 still stands (about ontology and conceptualisation, but also about interpretability, communication, and hierarchical composability, which I didn’t mention originally, but which are discussed at length in “Designing Ecosystems of Intelligence from First Principles” (Friston et al., Dec 2022)). Again, these are arguments about the usefulness of the model, not criticisms of people.
Sorry, I’ll rephrase. I expect you to agree with the following; do you?
“If you have an actual mechanical thermostat connected to an actual heater, and that’s literally the whole system, then this particular system is a feedback control system. And the most useful way to model it and to think about it is as a feedback control system. It would be unhelpful (or maybe downright incorrect?) to call this particular system an Active Inference system, and to say that it’s ‘predicting’ that the room temperature will stay constant.”
Unhelpful—yes.

“Downright incorrect”—no, because the Active Inference model would simply be a mathematical generalisation of the (simple) feedback control model of the thermostat. The implication “the thermostat is a feedback control system” → “the thermostat is an Active Inference agent” has the same “truth property” (sorry, I don’t know the correct term for this in logic) as the implication “A is a group” → “A is a semigroup”. Just a strict mathematical model generalisation.
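To spell out that analogy in standard algebra terms (this is just a restatement of the claim above, nothing additional):

```latex
\text{Semigroup: } (A,\cdot)\ \text{such that}\ (a \cdot b) \cdot c = a \cdot (b \cdot c).
\qquad
\text{Group: the above, plus}\ \exists e \in A:\ e \cdot a = a \cdot e = a,
\ \text{and}\ \forall a\ \exists a^{-1}:\ a \cdot a^{-1} = a^{-1} \cdot a = e.
```

Every group is automatically a semigroup because the semigroup axioms are a subset of the group axioms, so the implication holds by definition. The claim here is that “thermostat is a feedback control system” → “thermostat is an Active Inference agent” is meant to hold in the same forget-the-extra-structure sense, with Active Inference as the more general description.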
“and to say that it’s ‘predicting’ that the room temperature will stay constant.”—no, it doesn’t predict specifically that “the temperature will stay constant”. It predicts (or “has a preference for”) a distribution over temperature states of the room, and tries to act so that the actual distribution of room temperatures matches this predicted distribution.
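As a concrete sketch of that reading (a toy construction, not the full Active Inference machinery: no hidden-state inference and no free-energy functional, just action selection that minimises surprise under a Gaussian preference over room temperature, with made-up one-step room dynamics), the same two-action thermostat can be written both ways and shown to behave identically:

```python
import math

SETPOINT = 20.0   # degrees C
SIGMA = 0.5       # width of the Gaussian "preferred" temperature distribution

def feedback_controller(temp):
    """Plain thermostat: heater on whenever the room is below the setpoint."""
    return "heater_on" if temp < SETPOINT else "heater_off"

def predicted_temp(temp, action):
    """Crude one-step model of the room (made-up dynamics for illustration)."""
    return temp + (1.0 if action == "heater_on" else -1.0)

def surprise(temp):
    """Negative log-density of the preferred (Gaussian) temperature distribution."""
    return 0.5 * ((temp - SETPOINT) / SIGMA) ** 2 + math.log(SIGMA * math.sqrt(2 * math.pi))

def surprise_minimising_controller(temp):
    """Pick the action whose predicted outcome is least surprising under the preference."""
    return min(["heater_on", "heater_off"],
               key=lambda action: surprise(predicted_temp(temp, action)))

for t in [15.0, 19.4, 20.6, 25.0]:
    assert feedback_controller(t) == surprise_minimising_controller(t)
print("Both descriptions prescribe the same on/off behaviour at every tested temperature.")
```

With the preference distribution tightly peaked at the setpoint, the surprise-minimising rule reproduces the ordinary bang-bang controller; that is the sense in which redescribing the thermostat as “matching a predicted/preferred distribution of temperatures” adds generality without changing the behaviour.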