This post proposes an approach to decision theory in which the notion of “actions” is emergent. Instead of having an ontologically fundamental notion of actions, the agent just has beliefs, and some of them are self-fulfilling prophecies. For example, the agent can discover that “whenever I believe my arm will move up/down, my arm truly moves up/down”, and then exploit this fact by moving the arm in the right direction to maximize utility. This works by having a “metabelief” (a mapping from beliefs to beliefs; my terminology, not the OP’s) and allowing the agent to choose its belief from among the metabelief’s fixed points.
The next natural question is then: can we indeed demonstrate that an agent will learn which part of the world it controls, under reasonable conditions? Abram implies that this should be possible if we only allow choice among attractive fixed points. He then bemoans the need for this restriction and tries to use ideas from Active Inference to fix it, with limited success. Personally, I don’t see what’s so bad about staying with the attractive fixed points.
Unfortunately, this post avoids spelling out a sequential version of the decision theory, which would be necessary to actually establish any learning-theoretic result. However, I think I see how it can be done, and it seems to support Abram’s claims. Details follow.
Let’s suppose that the agent observes two systems, each of which can be in one of two positions. At each moment of time, it observes an element of $A \times B$, where $|A| = |B| = 2$. The agent believes it can control one of $A$ and $B$, whereas the other is a fair coin. However, it doesn’t know which is which.
In this case, metabeliefs are mappings of type $\theta : \Delta(A\times B)^\omega \to \Delta(A\times B)^\omega$. Specifically, we have a hypothesis $\alpha$ that asserts $A$ is controllable, a hypothesis $\beta$ that asserts $B$ is controllable, and the overall metabelief is (say) $\zeta = \frac{1}{2}\alpha + \frac{1}{2}\beta$.
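To fix notation in code (this is my own rendering, not the OP’s, and it approximates a belief over infinite histories by its one-step conditionals), a belief maps a finite history to a distribution over the next observation pair, and a metabelief maps beliefs to beliefs:

```python
from typing import Callable, Dict, Tuple

Obs = Tuple[int, int]                            # one element of A x B, with |A| = |B| = 2
History = Tuple[Obs, ...]                        # a finite history in (A x B)*
Belief = Callable[[History], Dict[Obs, float]]   # next-step conditionals of a belief in Delta((A x B)^omega)
Metabelief = Callable[[Belief], Belief]          # a mapping from beliefs to beliefs
```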
The hypothesis $\alpha$ is defined by

$$\alpha(q; ij \mid h) := \frac{1}{2} f(q(i \mid h))$$

Here, $q \in \Delta(A\times B)^\omega$, $i \in A$, $j \in B$, $h \in (A\times B)^*$ and $f : [0,1] \to [0,1]$ is some “motor response function”, e.g. $f(x) := \frac{1}{2} + \frac{1}{2}\sqrt[3]{2x-1}$.
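As a quick numerical sanity check on this choice of $f$ (my own sketch, not something from the OP; the names are arbitrary): iterating $f$ pushes any belief slightly off $\frac{1}{2}$ towards $0$ or $1$, so $0$ and $1$ are attractive fixed points while $\frac{1}{2}$ is repelling.

```python
import math

def cbrt(y):
    # real cube root (Python's ** returns a complex number for negative bases)
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def f(x):
    # motor response function f(x) = 1/2 + 1/2 * (2x - 1)^(1/3)
    return 0.5 + 0.5 * cbrt(2.0 * x - 1.0)

# Iterate f from beliefs slightly below, at, and slightly above 1/2.
for x0 in (0.49, 0.5, 0.51):
    x = x0
    for _ in range(20):
        x = f(x)
    print(f"start {x0} -> {x:.6f}")   # ~0.0, 0.5, ~1.0 respectively
```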
Similarly, $\beta$ is defined by

$$\beta(q; ij \mid h) := \frac{1}{2} f(q(j \mid h))$$
Now, let $\xi \in \Delta(A\times B)^\omega$ be an attractive fixed point of $\zeta$ and consider some history $h \in (A\times B)^*$. If the statistics of $A$ in $h$ seem biased towards $\xi$ whereas the statistics of $B$ in $h$ seem like a fair coin, then the likelihoods will satisfy $\alpha(\xi; h) \gg \beta(\xi; h)$, and hence $\zeta(\xi; i \mid h) = \xi(i \mid h)$ will be close to $\alpha(\xi; i \mid h) = f(\xi(i \mid h))$, and therefore will be close to $\{0, 1\}$ (since $\xi$ is an attractive fixed point). On the other hand, in the converse situation, the likelihoods will satisfy $\alpha(\xi; h) \ll \beta(\xi; h)$, and hence $\zeta(\xi; i \mid h) = \xi(i \mid h)$ will be close to $\beta(\xi; i \mid h) = \frac{1}{2}$. Hence, the agent effectively updates on the observed history and will choose some fixed point $\xi^*$ which controls the available degrees of freedom correctly.
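Here is a minimal simulation of the likelihood comparison (my own illustration with made-up numbers: the belief is taken to be i.i.d. across time with $\xi(i \mid h) = 0.9$, near the attractive fixed point at $1$, and in the true world $A$ responds to the belief via $f$ while $B$ is a fair coin). The log-likelihood ratio of $\alpha$ over $\beta$ grows roughly linearly with time, which is the sense in which $\alpha(\xi;h) \gg \beta(\xi;h)$.

```python
import math, random

def cbrt(y):
    # real cube root, handling negative arguments
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def f(x):
    # motor response function f(x) = 1/2 + 1/2 * (2x - 1)^(1/3)
    return 0.5 + 0.5 * cbrt(2.0 * x - 1.0)

random.seed(0)

# Stand-in belief xi, i.i.d. across time: xi(A=1|h) = 0.9, xi(B=1|h) = 0.5.
p_a_belief, p_b_belief = 0.9, 0.5

# True world (unknown to the agent): A responds to the belief via f, B is a fair coin.
p_a_true, p_b_true = f(p_a_belief), 0.5

log_alpha, log_beta = 0.0, 0.0
for _ in range(1000):
    a = 1 if random.random() < p_a_true else 0
    b = 1 if random.random() < p_b_true else 0
    # alpha: A is controllable (probability f(xi(i|h))), B is uniform
    pa = f(p_a_belief) if a == 1 else 1.0 - f(p_a_belief)
    log_alpha += math.log(0.5 * pa)
    # beta: B is controllable (probability f(xi(j|h))), A is uniform
    pb = f(p_b_belief) if b == 1 else 1.0 - f(p_b_belief)
    log_beta += math.log(0.5 * pb)

print("log likelihood ratio alpha/beta:", log_alpha - log_beta)
```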
Notice that none of this works with repelling fixed points. Indeed, if we used $f(x) := \frac{1}{2} + \frac{1}{2}(2x-1)^3$ then $\zeta$ would have a unique fixed point and there would be nothing to choose.
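For contrast, the same kind of quick check (again my illustration, not from the OP) with the cubed variant shows why there is nothing left to choose: starting anywhere in the interior, iteration collapses to $\frac{1}{2}$, i.e., the only attractive fixed point is the “fair coin” belief.

```python
def f_repelling(x):
    # variant with the cube in place of the cube root: now 0 and 1 are repelling
    return 0.5 + 0.5 * (2.0 * x - 1.0) ** 3

for x0 in (0.01, 0.5, 0.99):
    x = x0
    for _ in range(50):
        x = f_repelling(x)
    print(f"start {x0} -> {x:.6f}")   # all three land on 0.5
```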
I find these ideas quite intriguing and am likely to keep thinking about them!