My personal hot take on this is that a ‘goal-reaching’ value function should have a specific mathematical structure, so you should be able to infer the goal simply by studying the structure of the value function (e.g. by hill climbing to find its highest point). There are plenty of problems and difficulties with doing this, but something like it is the only way I can see this being done in general.
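To make the hill-climbing idea concrete, here is a minimal toy sketch (my own construction, not taken from any existing work): I assume a gridworld where the learned value function happens to be negative Manhattan distance to a hidden goal, and the names (`value_fn`, `infer_goal`, `HIDDEN_GOAL`) are all hypothetical. The point is just that greedily following increasing value recovers the goal state without ever being told what it was.

```python
import numpy as np

GRID_SIZE = 10
HIDDEN_GOAL = (7, 2)  # unknown to the analysis; we try to recover it from V alone


def value_fn(state):
    """Stand-in for a learned goal-reaching value function V(s):
    here, negative Manhattan distance to the hidden goal."""
    return -abs(state[0] - HIDDEN_GOAL[0]) - abs(state[1] - HIDDEN_GOAL[1])


def neighbors(state):
    """States reachable by one grid move (the local structure we climb over)."""
    x, y = state
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < GRID_SIZE and 0 <= cy < GRID_SIZE]


def infer_goal(start):
    """Greedy hill climbing: follow increasing V until a local maximum."""
    state = start
    while True:
        best = max(neighbors(state), key=value_fn)
        if value_fn(best) <= value_fn(state):
            return state  # local (here also global) maximum = inferred goal
        state = best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    start = tuple(int(c) for c in rng.integers(0, GRID_SIZE, size=2))
    print("start:", start, "-> inferred goal:", infer_goal(start))
```

In a realistic setting the value function won’t have this clean shape (local optima, partial observability, state spaces you can’t enumerate neighbors over), which is where most of the problems I mentioned come in.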
Since you didn’t reference it, I’ll point out that there is already work on identifying goals in chess agents (with 1-step lookahead).