In section 4, you discuss two different things that ought to be discussed separately. The first is whether it is useful to think of systems that are explicitly engineered as RL agents (or, more generally, with any explicit AI architecture other than Active Inference itself) as Active Inference agents:
4. Likewise, it’s easier to think of a reinforcement learning (RL) system as an RL system, and not as an active inference system
[...] actual RL practitioners almost universally don’t find the galaxy-brain perspective to be helpful.
I would say that whether it’s “easier to think” about RL agents as Active Inference agents (which you can do; see below) depends on what, exactly, you are thinking about.
I think there is one direction of thinking that is significantly aided by the Active Inference perspective: thinking about the ontology of agency (goals, objectives, rewards, optimisers and optimisation targets, goal-directedness, self-awareness, and related things). Under the Active Inference ontology, these concepts, which keep bewildering and confusing people on LW/AF and beyond, acquire quite straightforward interpretations. Goals are just beliefs about the future. Rewards are constraints on the physical dynamics of the system that, in turn, shape particular beliefs, as per the FEP and CMEP (Ramstead et al., 2023). Goal-directedness is a “strange loop” belief that one is an agent with goals[1]. (I’m currently writing an article where I elaborate on all these interpretations.)
This ontology is also useful for discussing agency in LLMs, whose architecture is very different from that of RL agents, and it saves one from ontological confusion wrt. agency (or the lack thereof) in LLMs.
The second is the discussion of agency in systems that are not explicitly engineered as RL agents (or as Active Inference agents, for that matter):
Consider a cold-blooded lizard that goes to warm spots when it feels cold and cold spots when it feels hot. Suppose (for the sake of argument) that what’s happening behind the scenes is an RL algorithm in its brain, whose reward function is external temperature when the lizard feels cold, and whose reward function is negative external temperature when the lizard feels hot.
We can talk about this in the “normal” way, as a certain RL algorithm with a certain reward function, as per the previous sentence.
…Or we can talk about this in the galaxy-brain “active inference” way, where the lizard is (implicitly) “predicting” that its body temperature will remain constant, and taking actions to make this “prediction” come true.
I claim that we should think about it in the normal way. I think that the galaxy-brain “active inference” perspective is just adding a lot of confusion for no benefit.
Imposing an RL algorithm on the dynamics of the lizard’s brain and body is no more justified than imposing the Active Inference algorithm on it. Therefore, there are no grounds for calling the first “normal” and the second “galaxy-brained”: it’s normal scientific work to find which algorithm predicts the behaviour of the lizard better.
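To illustrate what this “normal scientific work” could look like, here is a toy model-comparison sketch. Everything in it is a made-up placeholder (the data, both “fitted” policy tables, and the function names); the point is only that the two candidate decision theories are compared on how well they predict observed behaviour.

```python
import numpy as np

# Hypothetical sketch: compare how well two fitted candidate models predict the
# lizard's observed moves. The data and both policy tables are placeholders.

observed_states  = np.array([0, 0, 1, 2, 2, 1, 0, 2])   # e.g. felt-temperature bins
observed_actions = np.array([1, 1, 0, 2, 2, 0, 1, 2])   # e.g. move-to-warm / stay / move-to-cold

def rl_policy(state):
    # P(action | state) under a (fitted) max-entropy RL model -- placeholder numbers
    table = np.array([[0.1, 0.8, 0.1],
                      [0.3, 0.4, 0.3],
                      [0.1, 0.1, 0.8]])
    return table[state]

def aif_policy(state):
    # P(action | state) under a (fitted) Active Inference model -- placeholder numbers
    table = np.array([[0.2, 0.6, 0.2],
                      [0.3, 0.4, 0.3],
                      [0.2, 0.2, 0.6]])
    return table[state]

def log_likelihood(policy):
    # Log-probability of the observed actions under the candidate model.
    return sum(np.log(policy(s)[a]) for s, a in zip(observed_states, observed_actions))

print("RL :", log_likelihood(rl_policy))
print("AIF:", log_likelihood(aif_policy))   # prefer the model with the higher held-out likelihood
```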
There is a methodological reason to choose the Active Inference theory of agency, though: it is more generic[2]. Active Inference recovers RL (with or without entropy regularisation) as limit cases, but the inverse is not true:
(See Figure 3 in Barp et al., Jul 2022.)
We can spare ourselves the work of deciding whether a lizard acts as a maximum entropy RL agent or an Active Inference agent. In the statistical limit of systems whose internal dynamics follow their path of least action exactly (such systems are called precise agents in Barp et al., Jul 2022 and conservative particles in Friston et al., Nov 2022) and whose sensory observations don’t exhibit random fluctuations, there is “no ambiguity” in decision making under Active Inference (calling this “no ambiguity” could be somewhat confusing, but it is what it is), and thus Active Inference becomes maximum entropy RL (Haarnoja et al., 2018) exactly. So you can think of a lizard (or a human, of course) as a maximum entropy RL agent; this is consistent with Active Inference.
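To make the reduction concrete, here is a rough, compressed sketch of why the “no ambiguity” limit lands on maximum entropy RL (my own summary; see Barp et al., Jul 2022 for the precise statement). The expected free energy of a policy $\pi$ can be written in the standard risk-plus-ambiguity form:

$$
G(\pi) \;=\; \underbrace{D_{\mathrm{KL}}\!\big[Q(o \mid \pi)\,\|\,P(o)\big]}_{\text{risk}} \;+\; \underbrace{\mathbb{E}_{Q(s \mid \pi)}\big[\mathrm{H}[P(o \mid s)]\big]}_{\text{ambiguity}}.
$$

For a precise agent with noiseless observations, the ambiguity term vanishes, leaving only risk, which expands to

$$
G(\pi) \;\approx\; -\,\mathbb{E}_{Q(o \mid \pi)}\big[\ln P(o)\big] \;-\; \mathrm{H}\big[Q(o \mid \pi)\big].
$$

Reading the log-preference $\ln P(o)$ as a reward $r(o)$, minimising $G(\pi)$ amounts to maximising expected reward plus an entropy bonus, which is the shape of the maximum entropy RL objective (Haarnoja et al., 2018). (I’m glossing over details here, e.g. the difference between entropy over predicted outcomes and entropy over actions.)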
Cf. Joscha Bach’s description of the formation of this “strange loop” belief in biological organisms: “We have a loop between our intentions and the actions that we perform that our body executes, and the observations that we are making and the feedback that they have on our interoception giving rise to new intentions. And only in the context of this loop, I believe, can we discover that we have a body. The body is not given, it is discovered together with our intentions and our actions and the world itself. So, all these parts depend crucially on each other so that we can notice them. We basically discover this loop as a model of our own agency.”
Note, however, that there is no claim that Active Inference is the ultimately generic theory of agency. In fact, it is already clear that it is not, because it doesn’t account for the fact that most cognitive systems won’t be able to combine all their beliefs into a single, Bayes-coherent multi-factor belief structure (the problem of intrinsic contextuality). The “ultimate” decision theory should be quantum. See Basieva et al. (2021), Pothos & Busemeyer (2022), and Fields & Glazebrook (2022) for some recent reviews, and Fields et al. (2022a) and Tanaka et al. (2022) for examples of recent work. Active Inference could, perhaps, still serve as a useful tool for ontologising effectively classical agency, or a Bayes-coherent “thread of agency” enacted by a system. However, I regard the problem of intrinsic contextuality as the main “threat” to the FEP and Active Inference. Work on updating the FEP so that it accounts for intrinsic contextuality has recently started (Fields et al., 2022a; 2022b).
it’s normal scientific work to find which algorithm predicts the behaviour of the lizard better
I’m confused. Everyone including Friston says FEP is an unfalsifiable tautology—if a lizard brain does X, and it’s not totally impossible for such a lizard to remain alive and have bodily integrity, then the prediction of FEP is always “Yes it is possible for the lizard brain to do X”.
The way you’re talking here seems to suggest that FEP is making more specific predictions than that, e.g. you seem to be implying that there’s such a thing as an “Active Inference agent” that is different from an RL agent (I think?), which would mean that you are sneaking in additional hypotheses beyond FEP itself, right? If so, what are those hypotheses? Or sorry if I’m misunderstanding.
The way you’re talking here seems to suggest that FEP is making more specific predictions than that, e.g. you seem to be implying that there’s such a thing as an “Active Inference agent” that is different from an RL agent (I think?), which would mean that you are sneaking in additional hypotheses beyond FEP itself, right? If so, what are those hypotheses? Or sorry if I’m misunderstanding.
Yes, this is right. FEP != Active Inference. I deal with this at length in another comment. Two additional assumptions are: 1) the agent “holds” (in some sense; whether representationalist or enactivist is less important here) beliefs about the future, whereas under the FEP alone, the system only holds beliefs about the present. 2) The agent selects actions so as to minimise expected free energy (EFE) wrt. these beliefs about the future. (Action selection, or decision making, is entirely absent from the FEP, which is simply a dynamicist “background” for Active Inference: it explains what the beliefs in Active Inference are. When you start talking about action selection or decision making, you imply causal symmetry breaking.)
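A minimal sketch of what these two extra assumptions add, in code. Everything here is illustrative (the outcome labels, the stubbed predictive distribution, and the policy names are not from any real model); the point is only that the agent carries prior preferences over future outcomes and scores policies by expected free energy.

```python
import numpy as np

rng = np.random.default_rng(0)

n_outcomes = 3                                            # e.g. "too cold", "comfortable", "too hot"
log_preferences = np.log(np.array([0.05, 0.90, 0.05]))    # assumption 1: beliefs about the future, ln P(o)

def predict_outcomes(policy):
    """Q(o | policy): the agent's predictive distribution over future outcomes.
    Stubbed: ignores the policy and draws a fixed random distribution per call."""
    return rng.dirichlet(np.ones(n_outcomes))

def expected_free_energy(q_o):
    """Risk-only EFE (ambiguity omitted for brevity):
    KL[Q(o|pi) || P(o)] = E_Q[ln Q(o|pi) - ln P(o)]."""
    return np.sum(q_o * (np.log(q_o) - log_preferences))

# Assumption 2: actions/policies are selected by minimising EFE.
policies = ["stay", "move_to_warm_spot", "move_to_cold_spot"]
G = np.array([expected_free_energy(predict_outcomes(pi)) for pi in policies])

# Softmax over negative EFE: policies expected to realise the preferred future are more probable.
p_policy = np.exp(-G) / np.exp(-G).sum()
print(dict(zip(policies, p_policy)))
```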
Can you give a concrete example of a thing that has homeostasis / bodily integrity / etc. (and therefore FEP applies to it), but for which it is incorrect (not just unhelpful but actually technically incorrect) to call this thing an Active Inference agent?
(I would have guessed that “a room with a mechanical thermostat & space heater” would be an example of that, i.e. a thing for which FEP applies but which is NOT an Active Inference agent. But nope! Your other comment says that “a room with a mechanical thermostat & space heater” is in fact an Active Inference agent.)
Imposing an RL algorithm on the dynamics of the lizard’s brain and body is no more justified than imposing the Active Inference algorithm on it.
I think you misunderstood (or missed) the part where I wrote “Suppose (for the sake of argument) that what’s happening behind the scenes is an RL algorithm in its brain, whose reward function is external temperature when the lizard feels cold, and whose reward function is negative external temperature when the lizard feels hot.”
What I’m saying here is that RL is not a thing I am “imposing” on the lizard brain—it’s how the brain actually works (in this for-the-sake-of-argument hypothetical).
Pick your favorite RL algorithm—let’s say PPO. And imagine that when we look inside this lizard brain we find every step of the PPO algorithm implemented in neurons in a way that exactly parallels, line-by-line, how PPO works in the textbooks. “Aha”, you say, “look at the pattern of synapses in this group of 10,000 neurons, this turns out to be exactly how you would wire together neurons to calculate the KL divergence of (blah blah blah). And look at that group of neurons! It is configured in the exact right way to double the β parameter when the divergence is too high. And look at …” etc. etc.
Is this realistic? No. Lizard brains do not literally implement the PPO algorithm. But they could in principle, and if they did, we would find that the lizards move around in a way that effectively maintains their body temperature. And FEP would apply to those hypothetical lizard brains, just like FEP applies by definition to everything with bodily integrity etc. But here we can say that the person who says “the lizard brain is running an RL algorithm, namely PPO with thus-and-such reward function” is correctly describing a gears-level model of this hypothetical lizard brain. They are not “imposing” anything! Whereas the person who says “the lizard is ‘predicting’ that its body temperature will be constant” is not doing that. The latter person is much farther away from understanding this hypothetical lizard brain than the former person, right?
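For concreteness, the “double the β parameter when the divergence is too high” step corresponds to the adaptive-KL-penalty variant of PPO; a minimal sketch of just that update rule (the 1.5 threshold and all names are illustrative, and of course no claim about real lizard brains):

```python
# Hypothetical sketch of the adaptive KL-penalty step alluded to above.

def update_kl_penalty(beta: float, kl: float, kl_target: float) -> float:
    """Adjust the KL-penalty coefficient used in the penalised surrogate objective
    L = E[ratio * advantage] - beta * KL[pi_old || pi_new]."""
    if kl > 1.5 * kl_target:
        beta *= 2.0      # policy moved too far: penalise divergence more
    elif kl < kl_target / 1.5:
        beta /= 2.0      # policy barely moved: relax the penalty
    return beta

# Example: the measured KL after an update is 0.04 against a 0.01 target,
# so the penalty coefficient doubles.
print(update_kl_penalty(beta=1.0, kl=0.04, kl_target=0.01))  # 2.0
```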
Yes, I mentally skipped the part where you created an “artificial lizard with RL architecture” (that was unexpected). Then the argument collapses to the first part of the comment you are replying to: the gears-level view is more precise, of course, but the bird’s-eye view of Active Inference could give you the concepts for thinking about agency, persuadability (aka corrigibility), etc., without the need to re-invent them, and without spawning a plethora of concepts which don’t make sense in the abstract and are specific to each AI algorithm/architecture (for example, the concept of “reward” is not a fundamental concept of alignment, because it applies to RL agents but doesn’t apply to LLMs, which are also agents).