A month after writing my post on Focus as a crucial component of goal-directedness, I think I see its real point more clearly. You can decompose the proposal in my post into two main ideas:
1. How much a system S is trying to accomplish a goal G can be captured by the distance from S to the set of policies maximally goal-directed towards G.
2. The set of policies maximally directed towards G is the set of policies trained by RL (for every amount of resources above a threshold) on the reward corresponding to G.
The first idea is what focus is really about. The second doesn't work as well, and multiple people pointed out issues with it. But I still find powerful the idea that focus on a goal measures the similarity with a set of policies that only try to accomplish this goal.
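To make the first idea concrete, here is a minimal sketch (assuming tabular policies represented as state-by-action probability arrays, and using total-variation distance averaged over states as the policy metric; both the representation and the metric are my illustrative choices, not something fixed by the proposal):

```python
import numpy as np

# Hypothetical tabular representation: a policy is a
# (num_states, num_actions) array of action probabilities.

def policy_distance(pi, pi_star):
    """Total-variation distance between the two policies' action
    distributions, averaged over states; lies in [0, 1]."""
    return 0.5 * np.abs(pi - pi_star).sum(axis=1).mean()

def focus(pi, goal_directed_set):
    """Focus of `pi` on goal G: closeness of `pi` to the nearest
    member of the set of policies maximally goal-directed towards G.
    Returns 1 when `pi` coincides with some member of that set."""
    return 1.0 - min(policy_distance(pi, pi_star)
                     for pi_star in goal_directed_set)

# Toy usage: 2 states, 2 actions.
always_a0 = np.array([[1.0, 0.0], [1.0, 0.0]])  # stand-in for a maximally goal-directed policy
uniform   = np.array([[0.5, 0.5], [0.5, 0.5]])  # a policy with no particular focus
print(focus(always_a0, [always_a0]))  # 1.0
print(focus(uniform,   [always_a0]))  # 0.5
```

The key point the sketch makes explicit is that focus is a distance to a *set*, not to a single policy: the minimum over the set means a system counts as focused if it resembles any way of single-mindedly pursuing G.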
Now the big question left is: can we define the set of policies maximally goal-directed towards G in a clean way that captures our intuitions?