An Undergraduate Reading Of: Semantic information, autonomous agency and non-equilibrium statistical physics
This is a recent paper by Artemy Kolchinsky and David H. Wolpert, from the Santa Fe Institute. It was published in The Royal Society Interface on Oct 19. They propose a formal theory of semantic information, which is to say how to formally describe meaning. I am going over it in the style proposed here and shown here, approximately.
I will go through the sections in-line at first, and circle back if appropriate. Mostly this is because when I pasted the table of contents it very conveniently kept the links to the direct sections of the paper, which is an awesome feature.
Abstract: as is my custom, skipped.
Semantic information is information that is meaningful to a system, as distinct from syntactic information, which is merely correlational.
They cite the importance of the idea in these fields: biology, cognitive science, artificial intelligence, information theory, and philosophy.
Question one: can it be defined formally and generally?
Question two: can that definition be used in any physical system (rocks, hurricanes, cells, people)?
They claim the answer to both questions is yes. They define semantic information:
the information that a physical system has about its environment that is causally necessary for the system to maintain its own existence over time
Most of the time we study syntactic information, using Shannon’s theory.
Shannon explicitly avoided addressing what meaning a message over a telecommunication line might have.
One approach to address this is to assume an idealized system that optimizes some function, e.g. utility.
Under this approach, semantic information helps the system achieve its goal.
The problem with this approach is that the goal is defined exogenously. The meaning therefore belongs to the scientists who impute goals to the system, not to the system itself.
We want meaning based on the intrinsic properties of the system.
In biology the goal of an organism is fitness maximization, which leads to the teleosemantic approach, which roughly says a trait has meaning if at some time in the past it correlated with states of the environment (and therefore had a bearing on fitness).
Example: frogs snap their tongues at black spots in their visual field. This is semantic information because eating flies is good for frogs, and correlated with flies in the past.
The problem with teleosemantics is that it defines meaning in terms of the system's past history; the goal here is an ahistorical definition that relies only on the dynamics of the system in a given environment.
Another approach is autonomous agents, which maintain their own existence in an environment. This has self-preservation as the goal, and does not rely on history.
Autonomous agents get information about the environment, and then respond in ‘appropriate’ ways. Example:
For instance, a chemotactic bacterium senses the direction of chemical gradients in its particular environment and then moves in the direction of those gradients, thereby locating food and maintaining its own existence.
Research suggests the information used for self-maintenance is meaningful, but this concept has remained informal. In particular, there is no formal way to quantify the semantic information an agent has, or to determine the meaning of a particular state.
Their contribution:
We propose a formal, intrinsic definition of semantic information, applicable to any physical system coupled to an external environment
A footnote here says that the method should generalize to any dynamical system, but they focus on physical ones in the paper. This is an interesting claim to me.
There is ‘the system X’ and ‘the environment Y’; at some initial time t = 0, they are jointly distributed according to some initial distribution p(x0, y0); they undergo coupled (possibly stochastic) dynamics until time τ, where τ is some timescale of interest.
There is a viability function, which is the negative Shannon entropy of the distribution over the states of system X. This quantifies the ‘degree of existence’ at any given time. More information about the viability function in section 4.
Shannon entropy is used because it provides an upper bound on the probability of states of interest; it also has a well-developed connection to thermodynamics, which links the framework to non-equilibrium statistical physics.
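As I read it, the viability function is just the negative Shannon entropy of the system's marginal distribution at time τ. A minimal sketch of that computation (the function name and the toy 2×2 distributions are mine, not the paper's):

```python
import numpy as np

def viability(p_joint):
    """Negative Shannon entropy (in bits) of the system's marginal.

    p_joint: 2-D array, p_joint[x, y] = joint probability of
    system state x and environment state y at time tau.
    """
    p_x = p_joint.sum(axis=1)                 # marginal over system states
    p_x = p_x[p_x > 0]                        # drop zero-probability states
    return float(np.sum(p_x * np.log2(p_x)))  # -H(X) = sum_x p(x) log p(x)

# A system concentrated on one state has maximal viability (0 bits):
peaked = np.array([[1.0, 0.0], [0.0, 0.0]])
print(viability(peaked))   # 0.0

# A system spread evenly over two states has lower viability (-1 bit):
spread = np.array([[0.5, 0.0], [0.0, 0.5]])
print(viability(spread))   # -1.0
```

The intuition this captures: the more spread out the distribution over system states, the less sharply the system "exists", and the lower the viability.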
Semantic information is the subset of syntactic information that causally contributes to the continued existence of the system, i.e. to maintaining the value of the viability function.
They draw from Pearl, and use a form of interventions in order to quantify:
To quantify the causal contribution, we define counter-factual intervened distributions in which some of the syntactic information between the system and its environment is scrambled.
Figure 1 has some graphical examples.
They give three verbal examples for the scrambling procedure: switching rocks between fields, switching hurricanes between oceans, and switching birds between environments. This section is a little suspect to me; the hurricane and the rock were described as "low viability value of information" when scrambling consisted of putting them in very similar environments, but the bird was "high viability value of information" when scrambling put it in a random environment. Further, the rock and bird were on year timelines, and the hurricane only an hour. This might just be sloppy explanation. In the main, I would expect the lifespan of a system to be inversely correlated with its viability value of information overall, so I would have thought hurricane > bird > rock.
They use ‘coarse-graining’ methods from information theory to formalize transforming the actual distribution into intervened distributions.
The intervention that has the same (or greater?) viability as the actual distribution but contains the least syntactic information is called the viability-optimal intervention.
They interpret all of the syntactic information in the optimal intervention as semantic information, because any further scrambling would change the viability.
Semantic efficiency is the ratio of semantic to syntactic information. It quantifies how well tuned the system is to gathering only the information relevant to its own existence.
Semantic content of a system state x is the conditional distribution, under the optimal intervention, given state x. This can tell us the correlations relevant to maintaining the system.
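The definition of semantic content reads like an ordinary conditional distribution. A minimal sketch, assuming the viability-optimal intervened joint is already in hand (finding it is the hard part, skipped here; the function name and toy numbers are mine):

```python
import numpy as np

def semantic_content(p_intervened, x):
    """Conditional distribution over environment states given system state x.

    p_intervened is assumed to be the viability-optimal intervened joint
    distribution, i.e. only the viability-relevant correlations survive.
    """
    row = p_intervened[x]
    return row / row.sum()

# Hypothetical optimal intervened joint for a 2-state system and environment:
p_opt = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
print(semantic_content(p_opt, 0))  # [0.8 0.2]
```

Read off this way, the semantic content of state x = 0 is "the environment is probably in state 0 too", and only because that correlation matters for viability.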
They claim to be able to do point-wise semantic information as well.
The framework is not tied to the Shannon notion of syntactic information; from different measures of syntactic information they can derive corresponding measures of semantic information, e.g. thermodynamic ones via statistical physics.
Measures of semantic information are defined relative to the choice of:
(1) the particular division of the physical world into ‘the system’ and ‘the environment’;
(2) the timescale τ; and
(3) the initial probability distribution over the system and environment.
They suggest implications for an intrinsic definition of autonomous agency.
2. Non-equilibrium statistical physics: Body—skipped for now.
3. Preliminaries and physical set-up: Body—skipped for now.
4. The viability function: Body—mostly skipped, but I did go in to find the actual function:
define the viability function as the negative of the Shannon entropy of the marginal distribution of system x at time τ,
5. Semantic information via interventions: Body—skipped for now.
6. Automatic identification of initial distributions, timescales and decompositions of interest: Body—skipped for now.
Semantic information is syntactic information that is causally necessary for the system to continue.
It can be stored (mutual information between system and environment) and observed (transfer entropy exchanged between system and environment).
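The stored flavour is bounded by a standard mutual-information computation between system and environment. A sketch of that quantity (the toy distributions are mine; transfer entropy for the observed flavour would need time-series data and is omitted):

```python
import numpy as np

def mutual_information(p_joint):
    """I(X;Y) in bits from a 2-D joint distribution p_joint[x, y]."""
    p_x = p_joint.sum(axis=1, keepdims=True)   # marginal over system
    p_y = p_joint.sum(axis=0, keepdims=True)   # marginal over environment
    mask = p_joint > 0                          # avoid log(0)
    return float(np.sum(p_joint[mask] *
                        np.log2(p_joint[mask] / (p_x * p_y)[mask])))

correlated  = np.array([[0.5, 0.0], [0.0, 0.5]])    # X determines Y: 1 bit
independent = np.array([[0.25, 0.25], [0.25, 0.25]])  # no correlation: 0 bits
print(mutual_information(correlated))   # 1.0
print(mutual_information(independent))  # 0.0
```

The semantic portion is then whatever part of this mutual information survives as causally necessary under the optimal intervention.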
Semantic information can misrepresent the world; this misinformation shows up as information with a negative viability value.
Semantic information is asymmetrical between system and environment.
No need to decompose the system into different degrees of freedom (sensors/effectors, body/brain, membrane/interior).
Side-steps the question of internal models or representations entirely.
The framework does not assume organisms, but it may be useful for offering quantitative and formal definitions of life.
They suggest that high semantic information may be a necessary, though not sufficient, condition for being alive.
Note: I have left the links below for completeness, and to make it easy to interrogate the funding/associations of the authors. The appendices have some examples they develop.
End: I am putting this up before delving into the body sections in any detail, not least for length and readability. If there is interest, I can summarize those in the comments.
While writing up this post, the Embedded Agents post went up. It seems like this work could be conceptually relevant to three of the four areas of interest identified in that post, with Embedded World-Models being the odd man out because they explicitly skip the question of internal models.
Looking at this paper again in that vein, I am immediately curious about things like whether we can apply this iteratively to sub-systems of the system of interest. It seems like the answer is almost certainly yes.
I also take it more-or-less for granted that we can use these same ideas to define semantic information in relation to some arbitrary goal, or set of goals. It seems like putting the framework in an information-theoretic context is very helpful for this purpose. It feels like there should be some correspondence between the viability function and the partial achievement of goals.
Leaning on the information-theoretic interpretation again, I’m not even sure it would require any different treatment to allow for non-continuous persistence of the system (or goal). That way, things like the hibernation of a tardigrade, hiring a contractor at a future date, and an AI reloading itself from backup all become approachable.
But the devil is in the details, so I will table these speculations until after seeing whether the rest of the paper passes the sniff test.
This is quite interesting. There are a lot of attempts to figure out how we get meaning out of information. I think of it mostly in terms of how consciousness comes to relate to information in useful ways, but that makes sense to me mostly because I’m working in a strange paradigm (transcendental phenomenological idealism). But I think I see a lot of the same sorts of efforts to deal with meaning popping up in formal theories of consciousness, even if meaning is the exact thing that those are driving at; I just see the two as so closely tied that it’s hard to address one without touching on the other.