Thank you for writing this; it has been very informative and thought-provoking to read.
This requires identifying the actual and hypothetical internal variables of a human, and thus solving the “symbol grounding problem” for humans; ways of doing that are proposed.
It seems to me that in order to do what this research agenda suggests, one would need to solve the “symbol grounding problem” not only for humans but also for the FSI in question; that is, we must know how to integrate partial preferences into the FSI in a way that allows them to be relevant to the world through the FSI’s model of reality, body, and actions. Solving symbol grounding for humans alone does not promise us this, and thus does not promise us that our extraction of partial preferences will yield a coherent agent. In more straightforward ML scenarios, such as a game, the symbol grounding problem for the agent is automatically solved (to a certain degree, and with respect to the game it is developed to play) in a holistic way that takes its actions and “senses” into account, and to us this looks like a black box. Does this research direction have a solution to this problem, which could alternatively, it seems, be (partially) overcome by blunt RL in the real world?
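To make the game/RL point concrete, here is a minimal toy sketch (my own illustration, not anything proposed in the agenda): a tabular Q-learning agent in a small corridor environment. The only “symbols” the agent ever manipulates are state indices, action indices, and Q-values, and these are grounded solely through its own observation/action/reward loop; to an outside observer the learned table is just a black box of numbers.

```python
# Toy sketch (assumed example, not from the post): tabular Q-learning in a
# 5-state corridor. The agent's internal quantities are grounded only
# through its interaction loop, not through any human-legible labels.
import random

N_STATES, GOAL = 5, 4            # corridor states 0..4, reward at the right end
ACTIONS = [-1, +1]               # step left / step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1

def greedy(qs):
    # Break ties randomly among the highest-valued actions.
    best = max(qs)
    return random.choice([i for i, q in enumerate(qs) if q == best])

for episode in range(200):
    s = 0
    while s != GOAL:
        a = random.randrange(2) if random.random() < eps else greedy(Q[s])
        s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0
        # The update only ever refers to the agent's own interface:
        # states it observed, actions it took, rewards it received.
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print(Q)   # grounded "meanings" for the agent; opaque numbers to us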
Also, if we have a solution to the symbol grounding problem for both humans and the FSI, couldn’t we just demand that the FSI reflect the grounding of what “human values” mean to humans in its internal symbol language, without needing to worry about specific partial preferences?
I am curious to know what you think of the existence of a second “symbol grounding problem”: under what circumstances can we solve it, and do we get an immediate solution if it is solved?
Please let me know if I missed something relevant that renders my questions silly.
According to the view presented here, we cannot rule out that there are other parts of the nervous system that may maintain minimal separate foci of consciousness; each of these foci would have its own DSS, and they would not include each other in their DSSs (otherwise they would share a single global awareness). Since the ability to dictate speech is normally reserved for the main DSS, the contents of an additional DSS won’t be relevant for speech unless the main DSS is affected by the side DSS and decides to speak because of this dependence. It is possible for two DSSs to affect each other without sharing global awareness, as happens, for example, when two people are talking. In this case, the conditions for inclusion in one DSS won’t be fulfilled by parts of the two different DSSs with respect to each other; in particular, the characteristic interaction time would be too long for inclusion in a single DSS.