I think rather than saying “The focus of S towards G is F”, I’d want to say something like “S is consistent with a focus F towards G”. In particular, any S is currently going to count as maximally focused towards many goals. Saying it’s maximally focused on each of them feels strange; saying its actions are consistent with maximal focus on any one of them feels more reasonable.
Honestly, I don’t care that much about the exact words; what matters to me is the formalism behind them. I personally don’t have any problem with saying that the system is maximally focused on multiple goals, since I see focus as measuring “what proportion of my actions are coherent with trying to accomplish the goal”. But if many people find this weird, I’m okay with changing the phrasing.
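To make that intuition concrete, here is a rough sketch of the measure I have in mind (the `coherent_actions` helper is hypothetical; it stands in for “the actions that some policy genuinely trying to accomplish G could take in this state”):

```python
from typing import Callable, Hashable, Sequence, Set, Tuple

State = Hashable
Action = Hashable

def focus(
    trajectory: Sequence[Tuple[State, Action]],
    coherent_actions: Callable[[State], Set[Action]],
) -> float:
    # Fraction of the system's actions that are coherent with pursuing the goal.
    if not trajectory:
        return 0.0
    coherent = sum(1 for s, a in trajectory if a in coherent_actions(s))
    return coherent / len(trajectory)
```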
E.g. if we require visiting all states in Go, that’s too strict: not because it’s intractable, but because once you’ve visited all those states you’ll already be extremely capable. And if we’re finding a sequence v(1), v(2), … of value-function approximations for Go, a condition like the following is not strict enough: requiring only that for each state s we can find an N such that v(i)(s) ≠ v(j)(s) for some i, j < N.
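To spell out why that kind of condition is too weak, here is its finite-sample version (the names are mine, purely illustrative):

```python
from typing import Callable, Hashable, Sequence

State = Hashable
ValueFn = Callable[[State], float]

def weak_nontriviality(value_fns: Sequence[ValueFn], states: Sequence[State]) -> bool:
    # The condition above, on a finite prefix of the sequence and a finite sample
    # of states: for every state s, at least two approximations disagree on v(s).
    return all(len({v(s) for v in value_fns}) >= 2 for s in states)
```

Any training procedure that perturbs each state’s value at least once passes this check, regardless of whether anything non-trivial was learned.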
I don’t yet see a good general condition.
Yes, as I mentioned in another comment, I’m not convinced anymore by this condition. And I don’t have a decent alternative yet.
Perhaps it’d be better to define G not as a set of states in one fixed environment, but as a function from environments to sets of states? (was this your meaning anyway? IIRC this is close to one of Michele’s setups)
This way you can say that my policy is focused if, for any given environment, it’s close to the outcome of non-trivial RL training within that environment. (Probably you’d define a system’s focus as 1/(max distance from Pol over all environments).)
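A rough sketch of what I mean, with placeholders: `train_rl` stands for whatever we end up counting as non-trivial RL training towards G(env), and `distance` for some metric on policies (e.g. worst-case disagreement over states):

```python
from typing import Callable, Hashable, Iterable, Set

State = Hashable
Environment = Hashable
Policy = Callable[[State], Hashable]  # a policy maps states to actions

# G as a function from environments to sets of goal states, as suggested above
Goal = Callable[[Environment], Set[State]]

def focus(
    policy: Policy,
    goal: Goal,
    environments: Iterable[Environment],
    train_rl: Callable[[Environment, Set[State]], Policy],
    distance: Callable[[Policy, Policy], float],
) -> float:
    # 1 / (max over environments of the distance between the system's policy
    # and the policy produced by non-trivial RL training towards G(env) in env).
    worst = max(distance(policy, train_rl(env, goal(env))) for env in environments)
    return float("inf") if worst == 0 else 1.0 / worst
```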
I like this idea, although I fail to see how it “solves” your problem with “A and B”. I think I get the intuition: in some environments, it will be easier to reach B than A, and if your system aims towards A instead of “A and B”, this might make it less focused towards “A and B” in those environments. But even then, the fact remains that for all G′ ⊇ G, the focus towards G′ is greater than or equal to the focus towards G. This is why I stand by my measure of triviality, or more intuitively a weight inversely proportional to the size of the goal.
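For concreteness, the simplest version of the weighting I have in mind (just one way to cash out “inversely proportional to the size of the goal”):

```python
from typing import Hashable, Set

State = Hashable

def weighted_focus(focus: float, goal: Set[State]) -> float:
    # Divide the raw focus towards G by the number of states in G,
    # so trivially large goals get discounted.
    return focus / len(goal)
```

So although focus towards a superset G′ is never smaller, the weighted measure can still rank G above G′ once G′ is large enough.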