The thing I’m advocating for is: if a system has seven different interconnected components, we should think of it as a system with seven different interconnected components, and we can have nice conversations about what those components are, how they work together and interact, and what the system is going to do, at the end of the day, in different situations.
I believe that the brain has a number of different interconnected components, including parts that implement feedback control (cf. the leptin example above) and parts that run an RL algorithm and maybe parts that run a probabilistic inference algorithm and so on. Is that belief of mine compatible with the brain being “a unified system”? Beats me, I don’t know what that means. You give examples like how a GAN has two DNN components but is also “a unified system”, and LeCun’s H-JEPA has “many distinct DNN components” but is also “a unified system”, if I understand you correctly. OK, then my proposed multi-component brain can also be “a unified system”, right? If not, what’s the difference?
Now, if the brain in fact has a number of different interconnected components doing different things, then we can still model it as a simpler system with fewer components, if we want to. The model will presumably capture some aspects / situations well, and others poorly—as all models do.
If that’s how we’re thinking about Active Inference—an imperfect model of a more complicated system—then I would expect to see people say things like the following:
“Active inference is an illuminating model for organisms A,B,C but a misleading model for organisms D,E,F” [and here’s how I know that…]
“Active inference is an illuminating model for situations / phenomena A,B,C but a misleading model for situations / phenomena D,E,F”
“Active inference is an illuminating model for brain regions A,B,C but a misleading model for brain regions D,E,F”
“…And that means that we can build a more accurate model of the brain by treating it as an Active Inference subsystem (i.e. for regions A,B,C) attached to a second non-Active-Inference subsystem (i.e. for regions D,E,F). The latter performs functions X,Y,Z, and the two subsystems are connected as follows….”
My personal experience of the Active Inference literature is very very different from that! People seem to treat Active Inference as an infallible and self-evident and omni-applicable model. If you don’t think of it that way, then I’m happy to hear that, but I’m afraid that it doesn’t come across in your blog posts, and I vote for you to put in more explicit statements along the lines of those bullet points above.
I believe that the brain has a number of different interconnected components, including parts that implement feedback control (cf. the leptin example above) and parts that run an RL algorithm and maybe parts that run a probabilistic inference algorithm and so on. Is that belief of mine compatible with the brain being “a unified system”?
“Being” a unified system has realist connotations. Let’s stay on the instrumentalist side, at least for now. I want to analyse the brain (or, really, the whole human) as a unified system, assume an intentional stance towards it, and build a theory of its mind, all because I want to predict its behaviour as a whole.
OK, then my proposed multi-component brain can also be “a unified system”, right? If not, what’s the difference?
Yes, it can and should be analysed as a unified system, to the extent that this is possible and reasonable (without falling into over-extending the theory by concocting components in an artificial, forced way).
“Active inference is an illuminating model for organisms A,B,C but a misleading model for organisms D,E,F”
Active Inference is simultaneously an illuminating and a misleading model of Homo sapiens, in much the same way that Newtonian physics is both an illuminating and a misleading theory of planetary motion. I explained why it’s misleading: it doesn’t account for intrinsic contextuality in inference and decision-making (see my other top-level comment on this). The austere instrumentalist response would be “OK, let’s, as scientists, model the same organism as two separate Active Inference models at the same time, and decide ‘which one acts at the moment’ based on the context”. While this approach could also be illuminating and instrumentally useful, I agree it begins to be very slippery.
Also, Active Inference is too simple/general (I suspect, but am not sure) to recover some of the more sophisticated distributed-control characteristics of the human brain (or, possibly, of AI systems, for that matter), as I discussed above.
Active Inference is not a “very misleading” model (“degree of misleadingness” is not a precise notion, but I hope you get it) of any “normal” agent we can think of (well, barring some extreme instances of quantumness/contextuality, à la agentic quantum AI, but let’s not go there), precisely because it’s very generic, as I discuss in this comment. By the same token, maximum entropy RL would be a misleading model of an agent sitting on the boundary between the scales where random fluctuations do and don’t make a difference (this should be a scale smaller than cellular, because random (quantum) fluctuations probably already don’t make a difference at the cellular scale; and because that scale is so small, it’s also hard to construct anything like a “smart” agent at it).
“Active inference is an illuminating model for situations / phenomena A,B,C but a misleading model for situations / phenomena D,E,F”
See above, pretty much. Active Inference is not a good model for decision making under cognitive dissonance, i.e., intrinsic contextuality (Fields typically cites Dzhafarov for psychological evidence of this).
Note that intrinsic contextuality is not a problem for every theory of agency; it’s more endemic to Active Inference than to (some) other theories. E.g., connectionist models, where NNs are first-class citizens of the theory, are immune to this problem. See Marciano et al. (2022) for some work in this direction.
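To make “intrinsic contextuality” slightly more concrete, here is a toy, CHSH-style check (my own illustration, not taken from Fields or Dzhafarov): if four ±1 responses are assumed to come from one context-independent joint distribution, a certain combination of their pairwise correlations can never exceed 2, yet context-dependent statistics can reach 4, so no single context-free joint distribution (and hence no single generative model of that kind) can reproduce them.

```python
# Toy CHSH-style illustration (my own, not from the cited papers): a single
# joint distribution over context-independent variables bounds this
# combination of pairwise correlations by 2.
from itertools import product

# a1, a2, b1, b2 are the four +/-1 responses; each deterministic joint
# assignment gives one value of S, and any joint distribution is a mixture
# of these assignments, so it cannot exceed the maximum below.
best_noncontextual_S = max(
    a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2
    for a1, a2, b1, b2 in product([-1, 1], repeat=4)
)
print(best_noncontextual_S)  # 2

# Context-dependent statistics can instead have
# E[a1*b1] + E[a1*b2] + E[a2*b1] - E[a2*b2] = 4 > 2,
# which no single context-free joint distribution can reproduce.
```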
Also, Active Inference is probably not a very illuminating model for discussing access consciousness and memory (some authors, e.g., Whyte & Smith (2020), claim that it is at least partially illuminating, but it is definitely not the whole story). (Phenomenal consciousness is more debatable, and I tend to think that Active Inference is a rather illuminating model of that phenomenon.)
“Active inference is an illuminating model for brain regions A,B,C but a misleading model for brain regions D,E,F”
I know too little about brain regions, neurobiological mechanisms, and the like, but I guess it’s possible (maybe not today, or maybe even today; I really don’t know enough to judge, and you can tell better; also, these may already have been found and I just don’t know the literature) to find the mechanisms and brain regions that play a role during “context switches”, or in ensuring long-range distributed control, as discussed above. Or in access consciousness and memory.
“…And that means that we can build a more accurate model of the brain by treating it as an Active Inference subsystem (i.e. for regions A,B,C) attached to a second non-Active-Inference subsystem (i.e. for regions D,E,F). The latter performs functions X,Y,Z, and the two subsystems are connected as follows….”
I hope it is clear from the exposition above that you couldn’t quite factor Active Inference out into a subsystem of the brain/mind (unless under a “multiple Active Inference models with context switches” model of the mind, but, as I noted above, I think that would be a rather iffy model to begin with). I would rather say: Active Inference stays as a “framework” model with certain “extra-Act-Inf” pieces (such as access consciousness and memory) “attached” to it, plus other models (distributed control, and maybe others I haven’t thought about deeply) that don’t quite cohere with Active Inference at all, so we can only resort to modelling the brain/mind as either one or the other, getting predictions, and comparing them.
My personal experience of the Active Inference literature is very very different from that! People seem to treat Active Inference as an infallible and self-evident and omni-applicable model. If you don’t think of it that way, then I’m happy to hear that, but I’m afraid that it doesn’t come across in your blog posts, and I vote for you to put in more explicit statements along the lines of those bullet points above.
Ok, I did that. My own understanding (and appreciation of the limitations) of Active Inference has progressed significantly, and has changed considerably in just the last several weeks, so most of my earlier blog posts could be significantly wrong or misleading in important respects (and, I think, even my current writing is significantly wrong and misleading, because this process of re-appreciating Active Inference hasn’t settled for me yet).
I would also encourage you to read about thermodynamic machine learning and MCR^2 theories. (I want to do this myself, but still haven’t.)
It’s cool that you’re treating Active Inference as a specific model that might or might not apply to particular situations, organisms, brain regions, etc. In fact, that arguably puts you outside the group of people / papers that this blog post is even criticizing in the first place—see Section 0.
A thing that puzzles me, though, is your negative reactions to Sections 3 & 4. From this thread, it seems to me that your reaction to Section 3 should have been:
“If you have an actual mechanical thermostat connected to an actual heater, and that’s literally the whole system, then obviously this is a feedback control system. So anyone who uses Active Inference language to talk about this system, like by saying that it’s ‘predicting’ that the room temperature will stay constant, is off their rocker! And… EITHER …that position is a straw-man, nobody actually says things like that! OR …people do say that, and I join you in criticizing them!”
And similarly for Section 4, for a system that is actually, mechanistically, straightforwardly based on an RL algorithm.
But that wasn’t your reaction, right? Why not? Was it just because you misunderstood my post? Or what’s going on?
I thought your post was an explanation of why you don’t find Active Inference a useful theory/model, rather than a criticism of people. I mean, it sort of criticises the authors of the papers on FEP for various reasons, but who cares? I care whether the model is useful or not, not whether the people who proposed the theory were clear in their earlier writing (as long as you are able to arrive at an actual understanding of the theory). I didn’t see this as a central argument.
So, my original reaction to 3 (the root comment in this thread) was about the usefulness of the theory (vs control theory), not about people.
Re: 4, I already replied that I misunderstood your “mechanistic lizard” assumption. So only the first part of my original reply to 4 still stands (the part about ontology and conceptualisation, but also about interpretability, communication, and hierarchical composability, which I didn’t mention originally but which are discussed at length in “Designing Ecosystems of Intelligence from First Principles” (Friston et al., Dec 2022)). Again, these are arguments about the usefulness of the model, not about criticising people.
Sorry, I’ll rephrase. I expect you to agree with the following; do you?
“If you have an actual mechanical thermostat connected to an actual heater, and that’s literally the whole system, then this particular system is a feedback control system. And the most useful way to model it and to think about it is as a feedback control system. It would be unhelpful (or maybe downright incorrect?) to call this particular system an Active Inference system, and to say that it’s ‘predicting’ that the room temperature will stay constant.”
Unhelpful—yes.
“Downright incorrect”—no, because the Active Inference model would simply be a mathematical generalisation of the (simple) feedback-control model of the thermostat. The implication “thermostat is a feedback control system” → “thermostat is an Active Inference agent” has the same “truth property” (sorry, I don’t know the correct term for this in logic) as the implication “A is a group” → “A is a semigroup”. It’s just a strict generalisation of the mathematical model.
“and to say that it’s ‘predicting’ that the room temperature will stay constant.”—no, it doesn’t predict specifically that “the temperature will stay constant”. It predicts (or “has a preference for”) a distribution over temperature states of the room, and it tries to act so that the actual distribution of room temperatures matches this predicted distribution.
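To make that concrete, here is a minimal toy sketch (my own, with made-up dynamics and numbers, not anyone’s published model) of a thermostat read as a degenerate Active Inference agent: the “prediction” is a Gaussian preference distribution over room temperatures, and the chosen action is whichever one makes the predicted next temperature least surprising under that preference. With a narrow preference and deterministic dynamics, this collapses into ordinary bang-bang feedback control, which is the generalisation point above.

```python
# Toy sketch: thermostat as a degenerate Active Inference agent.
# All dynamics and numbers are assumptions for illustration.
import math

PREF_MEAN, PREF_SD = 20.0, 0.5  # preference ("predicted") distribution over temperature

def surprise(temp):
    # negative log-probability of temp under the Gaussian preference
    return 0.5 * ((temp - PREF_MEAN) / PREF_SD) ** 2 + math.log(PREF_SD * math.sqrt(2 * math.pi))

def predicted_next(temp, heater_on):
    # assumed toy dynamics: the room drifts toward 15 C, the heater adds heat
    return temp + 0.1 * (15.0 - temp) + (0.8 if heater_on else 0.0)

def act(temp):
    # choose the action whose predicted outcome is least surprising
    return min([False, True], key=lambda on: surprise(predicted_next(temp, on)))

temp = 17.0
for _ in range(20):
    heater = act(temp)
    temp = predicted_next(temp, heater)
print(round(temp, 2))  # hovers around the preferred 20 C, like a bang-bang controller
```

There is no epistemic-value term and no uncertainty here, so nothing is gained over the feedback-control description; the point is only that the two descriptions coincide for this system.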
I’m pretty confused here.