Questions for a Theory of Narratives
I kindly ask for your comments, questions and feedback of any kind.
Various predictive and rigorous mathematical (/logical/linguistic) frameworks exist for analyzing and implementing agents' world MODELLING, especially that of causality (e.g. Pearl's structural causal models), their ACTION (e.g. decision theory, reinforcement learning), and some aspects of their COMMUNICATION (information theory; vocabulary, syntax, semantics, pragmatics). I believe one missing link between communication and the former two aspects of intelligent agents is a formal theory of NARRATIVES or stories: at least as far as I am aware, there is no good theory of why we share bits of world model and policy with one another in the way that we do.
This is a very rough dissatisfaction, sort of an itch for having one or several objective functions for narratives that are more specific than very general information-theoretic criteria (which are usually agnostic to the structure of world model and policy). I have a myriad of questions that I want a good model of narratives to answer, and I think sharing these is the best way to elucidate why I believe a theory of narratives, or at least implementable, validatable models of specific aspects of them, is necessary.
The questions below are in no particular order and, at the moment, still very rough. In fact, some questions are so general that I don't expect them to be answerable at all; I still think they are useful, and currently the best (or only) tools I have for sketching the space I want to explore. In any case, I will probably follow up with a post where I try to lay out some terminology more rigorously, and then go into the first two questions in more depth.
How exactly do narratives relate to causality? They seem to be situational compressions of internal models of causality, with the goal of efficiently informing the listener about a part of one's world model or policy. Narratives thus seem to pick out relevant, relatively simple (acyclic?) subgraphs of a more complex internalized (cyclic?) causal graph.
Is the internal causal graph possibly cyclic because its nodes (abstractions) group multiple individual events, which have a (possibly known) acyclic causal structure, into classes or ensembles of objects that may persist across time, individually and together? Such a node then interacts with its neighbors multiple times (across time and across the ensemble), in effect smearing out clear causal directionality until one is left with mere correlation. It seems that as one goes higher up in an ontology, one sacrifices causal clarity for abstraction power, and ends up with correlation only. How can an ontology be constructed that respects causal direction optimally (w.r.t. policy communication)? How do you backtrack to a level of abstraction low enough to give a reliable causal picture, given what you want to communicate? (It seems these questions should be treatable with Vapnik-Chervonenkis theory and ergodic theory.)
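To make the intuition behind the two questions above concrete, here is a minimal sketch, assuming networkx; the toy graph, node names and the hand-picked "narrative" edges are purely illustrative, not a proposed formalism. Collapsing a feedback loop into a single coarse node recovers acyclicity at the price of within-loop directionality, and a narrative can be read as a small, relevant, acyclic subgraph.

```python
# Illustration only: a cyclic internal causal model, its condensation into coarser
# acyclic abstractions, and a narrative as a small acyclic subgraph.
import networkx as nx

# Hypothetical internal causal model; the loop stress -> poor_sleep -> health -> stress
# stands in for the feedback structure that blurs causal direction at higher abstraction.
internal_model = nx.DiGraph([
    ("deadline", "stress"), ("stress", "poor_sleep"),
    ("poor_sleep", "health"), ("health", "stress"),
    ("exercise", "health"), ("deadline", "overtime"),
])
print(nx.is_directed_acyclic_graph(internal_model))  # False: the internal model is cyclic

# Moving up the ontology: condensing strongly connected components collapses each
# feedback loop into one coarse node, and the coarse model is a DAG again,
# at the price of losing the within-loop causal directions.
coarse_model = nx.condensation(internal_model)
print(nx.is_directed_acyclic_graph(coarse_model))    # True
print([sorted(coarse_model.nodes[n]["members"]) for n in coarse_model.nodes])

# A narrative about why health suffered: a hand-picked, relevant, acyclic chain
# that is consistent with (a subgraph of) the internal model.
narrative = internal_model.edge_subgraph(
    [("deadline", "stress"), ("stress", "poor_sleep"), ("poor_sleep", "health")]
)
print(nx.is_directed_acyclic_graph(narrative))       # True
```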
How can a shared vocabulary emerge, and how do the agents validate their interpretations? How does the abstraction level of a given term/concept correlate with the number of different interpretations, by various language users, of its causal connections to other abstract terms? Does this come from coarser terms necessarily grouping together sets of correlates with different causal directions, thereby losing the power to distinguish those directions? (A recall-precision tradeoff.) Why is this a problem for natural language, but not for self-referential/mathematical languages?
What's the "point" of a given story? Why do the interlocutors get it, but literal/formal semantics doesn't? This is called pragmatics, and it hinges on the interlocutors' models of the world and of one another's model of the world. The Rational Speech Act model (RSA, see e.g. www.problang.org) is a good starting point for modelling this recursive Bayesian reasoning, but to my knowledge it has mostly been used for small-scale reference games.
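As a toy illustration of the RSA recursion (and of how far such reference games are from full narratives), here is a minimal sketch assuming numpy; the lexicon, referents and rationality parameter are invented for illustration rather than taken from problang.org.

```python
# Minimal Rational Speech Act (RSA) sketch for a toy reference game:
# literal listener L0, pragmatic speaker S1, pragmatic listener L1.
import numpy as np

# Referents: blue square, blue circle, green square.
# lexicon[u, w] = 1 if utterance u is literally true of referent w.
utterances = ["blue", "green", "square", "circle"]
lexicon = np.array([
    [1., 1., 0.],   # "blue"
    [0., 0., 1.],   # "green"
    [1., 0., 1.],   # "square"
    [0., 1., 0.],   # "circle"
])
prior = np.ones(3) / 3   # uniform prior over referents
alpha = 1.0              # speaker rationality (illustrative value)

def normalize(m, axis):
    return m / m.sum(axis=axis, keepdims=True)

L0 = normalize(lexicon * prior, axis=1)                     # P_L0(w|u) prop. to [[u true of w]] * P(w)
S1 = normalize(np.exp(alpha * np.log(L0 + 1e-12)), axis=0)  # P_S1(u|w) prop. to exp(alpha * log P_L0(w|u))
L1 = normalize(S1 * prior, axis=1)                          # P_L1(w|u) prop. to P_S1(u|w) * P(w)

print(dict(zip(utterances, np.round(L1, 2).tolist())))
# Hearing "blue", L1 favors the blue square over the blue circle, because a speaker
# meaning the circle could have said the more informative "circle".
```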
How do I structure a narrative to efficiently inform the listener w.r.t. its "point"? This would be an information-theoretic criterion such as the cross-entropy between the narrator's world model/policy and the inferred narratee's world model/policy. The criterion should motivate a smart teacher to explain complex things simply, and it should (jointly) minimize the effort spent by both interlocutors (as in https://arxiv.org/abs/2005.06641). The criterion could additionally incorporate competing objectives influencing the inclusion and ordering of narrator beliefs, such as temporal dependence, relevance, and the narrator's confidence in their truth (Gricean maxims).
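A very rough sketch of the kind of criterion meant here, assuming numpy; the narrator beliefs, the predicted listener posteriors per candidate narrative, and the effort weight are all invented placeholders: score candidate narratives by the cross-entropy between the narrator's beliefs about the "point" and the listener's predicted post-narrative beliefs, plus a penalty for effort.

```python
# Toy scoring of candidate narratives: alignment (cross-entropy) plus an effort cost.
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) log q(x): expected surprisal of the listener's
    post-narrative beliefs q under the narrator's beliefs p."""
    return -np.sum(p * np.log(q + eps))

narrator_belief = np.array([0.7, 0.2, 0.1])  # narrator's distribution over the "point"

# Hypothetical predicted listener posteriors after each candidate narrative,
# paired with a rough effort cost (e.g. length in clauses).
candidates = {
    "full causal chain":   (np.array([0.65, 0.25, 0.10]), 5),
    "one-line summary":    (np.array([0.50, 0.30, 0.20]), 1),
    "irrelevant anecdote": (np.array([0.34, 0.33, 0.33]), 3),
}

effort_weight = 0.05
scores = {
    name: cross_entropy(narrator_belief, q) + effort_weight * cost
    for name, (q, cost) in candidates.items()
}
print(min(scores, key=scores.get))  # the criterion trades belief alignment against effort
```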
How does expecting to have to share information (i.e. belonging to a group) make an agent structure their world model differently? E.g. maybe only group agents even have a self, i.e. a narrative about themselves.
If updating others' policies is a major objective of agent communication, how does policy structure influence the structure of narratives? E.g. if the policy has a directed acyclic graph structure (Bayesian networks/Markov decision processes), is this a reason for narratives to have a similar structure?
An agent may have internal narratives; an agent group may have collective narratives, such as myths. In both cases this seems to lead to more cohesion, i.e. pulling in one direction, even if the narrative is "untrue" to some degree. Do narrative-driven agents/groups outperform non-narrative-driven agents/groups?
What features do the agents' environments have to have for the answer to the last question to be "yes"? E.g. iterated interactions/games, a complex environment, group selection, group size. (Does going from a small tribe to a society entail a coarser/less true narrative? Do individuals pay an epistemological tax to belong to the hive? These questions are inspired by some of Joscha Bach's talks.)
Does a "good" internal narrative (of an individual or group) do more than just compress? Could it, through the agents' acting on it, do something like reduce entropy/become more true?
Do stories as we tell them have an identifiable grammar that can be roughly located in the Chomsky hierarchy, similar to the grammar of sentences? If so, and if a simulation of an agent group freely developing a sentence and story grammar were feasible, how would the agents' computational capacity, policy structure and environment impact the type of grammars they end up with?
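To illustrate what "locating story grammar in the Chomsky hierarchy" could mean in practice, here is a toy context-free story grammar in the spirit of classic story-grammar proposals; it assumes nltk, and the rules and the sample "story" are invented for illustration.

```python
# Toy "story grammar": story-level constituents (setting, episodes, events, reactions)
# written as a context-free grammar and parsed like a sentence.
import nltk

story_grammar = nltk.CFG.fromstring("""
    STORY -> SETTING EPISODE
    EPISODE -> EVENT REACTION EPISODE | EVENT REACTION
    SETTING -> 'once_upon_a_time'
    EVENT -> 'obstacle' | 'discovery'
    REACTION -> 'attempt' | 'outcome'
""")

parser = nltk.ChartParser(story_grammar)
story = ['once_upon_a_time', 'obstacle', 'attempt', 'discovery', 'outcome']
for tree in parser.parse(story):
    tree.pretty_print()  # one valid story-structure parse of the token sequence
    break
```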
How do I predict whose goals a given story serves? How do I filter signal from noise when I have little trust in a narrator? Can I expect to successfully extract the signal without knowing the narrator's goals? It seems information and trust level come as a package deal.
When is motivated reasoning appropriate? It usually occurs in defense of more deeply held beliefs/narratives, i.e. it should be purely a group phenomenon.
From a multi-modal encoder/decoder perspective: given that you model the world in a way that lets you tell stories better, and given a shared latent space, how is decoding some representation into a narrative structurally different from decoding it into a sequence of actions? It seems action sequences can be placed in the Chomsky hierarchy as well; do we perhaps use a sort of grammar templating engine (Broca's area?) of some given complexity that has learned grammar rulesets for different domains, like (motor) actions and language?
If cooperative communication is a kind of cooperative game, what kind of game is played in the various non-cooperative cases? How well does pragmatics work for each side (speaker/listener) when either side is non-cooperative, and when that non-cooperation is known?
How do the properties of the language used for communication shape the agents' narratives and policies? (Linguistic determinism, Sapir-Whorf.)