Thank you for writing this; it has been very informative and thought-provoking to read.
This requires identifying the actual and hypothetical internal variables of a human, and thus solving the “symbol grounding problem” for humans; ways of doing that are proposed.
It seems to me that in order to do what this research agenda suggests, one would need not only to solve the "symbol grounding problem" for humans, but also to solve it for the FSI in question; that is, we must know how to integrate partial preferences into UH in a way that allows them to be relevant to the world through the FSI's model of reality, body, and actions. Solving symbol grounding for humans alone does not guarantee this, and thus does not guarantee that our extraction of partial preferences will yield a coherent agent. In more straightforward ML settings, such as a game, the symbol grounding problem for the agent is solved automatically, to a certain degree and with respect to the game it is trained to play, in a holistic way that takes its actions and "senses" into account, though to us the result looks like a black box. Does this research direction have a solution to this problem, which could seemingly instead be (partially) overcome by blunt RL in the real world?
Also, if we have a solution to the symbol grounding problem for both humans and the FSI, couldn't we simply demand that UH reflect the grounding of what "human values" mean to humans in the internal symbol language of the FSI, without needing to worry about specific partial preferences?
I am curious what you think about the existence of this second "symbol grounding problem": under what circumstances can we solve it, and does solving it give us an immediate solution for UH?
Please let me know if I have missed something relevant that renders my questions silly.