Incommensurate thoughts: People with different life-experiences are literally incapable of understanding each other, because they compress information differently.
Analogy: Take some problem domain in which each data point is a 500-dimensional vector. Take a big set of 500D vectors and apply PCA to them to get a new reduced space of 25 dimensions. Store all data in the 25D space, and operate on it in that space.
Two programs exposed to different sets of 500D vectors, which differ in a biased way, will construct different basis vectors during PCA, and so will reduce all future vectors into different 25D spaces.
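To make the analogy concrete, here is a minimal sketch (assuming numpy and scikit-learn; the sample-generating function and its particular bias are my own illustration, not part of the original analogy). Two "people" fit PCA to differently biased experience, then compress the same new vector into incommensurate 25D representations that lose different information:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

def biased_samples(n, dim, strong_axes):
    # Experience in which some dimensions vary far more than others.
    scales = np.ones(dim)
    scales[strong_axes] = 10.0
    return rng.normal(size=(n, dim)) * scales

# Two biased life experiences emphasize different directions in the 500D space.
experience_a = biased_samples(2000, 500, strong_axes=np.arange(0, 25))
experience_b = biased_samples(2000, 500, strong_axes=np.arange(25, 50))

# Each "person" learns a 25D compression scheme from their own data.
pca_a = PCA(n_components=25).fit(experience_a)
pca_b = PCA(n_components=25).fit(experience_b)

# The learned bases barely overlap: each scheme attends to different directions.
overlap = np.abs(pca_a.components_ @ pca_b.components_.T).max()
print(overlap)  # close to 0; the two 25D spaces are nearly orthogonal

# The same new "text" is compressed into incommensurate representations,
# and each scheme introduces its own reconstruction error.
text = rng.normal(size=(1, 500))
err_a = np.linalg.norm(text - pca_a.inverse_transform(pca_a.transform(text)))
err_b = np.linalg.norm(text - pca_b.inverse_transform(pca_b.transform(text)))
print(err_a, err_b)  # different errors from the same input
```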
In just this way, two people with life experiences that differ in a biased way (due, e.g., to socioeconomic status, country of birth, or culture) will construct different underlying compression schemes. You can give them each a text with the same words in it, but the representations that each constructs internally are incommensurate; they exist in different spaces, which introduce different errors. When they reason on their compressed data, they will reach different conclusions, even if they are using the same reasoning algorithms and are executing them flawlessly. Furthermore, it would be very hard for them to discover this, since the compression scheme is unconscious. They would be more likely to believe that the other person is lying, nefarious, or stupid.
If you’re going to write about this, be sure to account for the fact that many people report successful communication of many different kinds. People say they have found their soul-mate; many of us have similar reactions to particular works of literature and art; and people often claim that someone else’s writing expresses an experience or an emotion in fine detail.
FWIW, this is one of the problems postmodernism attempts to address: the bit that’s a series of exercises in getting into other people’s heads to read a given text.
Does it work for understanding non-human peoples?
Yeah. I thought about this a lot in the context of the Hanson/Yudkowsky debate about the unmentionable event. As was frequently pointed out, both parties aspired to rationality and were debating in good faith, with the goal of getting closer to the truth.
Their belief was that two rationalists should be able to assign roughly the same probability to the same sequence of events X. That is, if the event X is objectively defined, then the problem of estimating p(X) is an objective one and all rational persons should obtain roughly the same value.
The problem is that we don’t—maybe can’t—estimate probabilities in isolation from other data. All estimates we make are really of conditional probabilities p(X|D), where D is a person’s unique, huge background dataset. The background dataset primes our compression/inference system. To use the Solomonoff idea, our brains construct a reasonably short code for D, and then use the same set of modules that were helpful in compressing D to compress X.
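Here is a toy sketch of the same point (standard library only; the two background corpora and the character-level model are hypothetical stand-ins for a person’s background data D). Two agents run the identical coding procedure flawlessly, yet assign different code lengths, and hence different implicit probabilities via p = 2^(-length), to the same X, because each was primed by a different D:

```python
import math
from collections import Counter

def char_model(background):
    # Learn character frequencies from background data D, with add-one smoothing.
    counts = Counter(background)
    vocab = set(background) | set("abcdefghijklmnopqrstuvwxyz ")
    total = sum(counts.values()) + len(vocab)
    return {c: (counts[c] + 1) / total for c in vocab}

def code_length_bits(model, x):
    # Code length of X under the scheme learned from D; shorter code = higher p(X|D).
    floor = min(model.values())  # fallback probability for unseen characters
    return sum(-math.log2(model.get(c, floor)) for c in x)

# Two different background datasets D, standing in for two life experiences.
background_a = "the markets cleared and prices rose " * 50
background_b = "the experiment failed to replicate twice " * 50

model_a = char_model(background_a)
model_b = char_model(background_b)

x = "the evidence updated my estimate"
# Same X, same flawless procedure, different implicit p(X|D):
print(code_length_bits(model_a, x))
print(code_length_bits(model_b, x))
```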
No idea what PCA means, but this sounds like a very mathematical way of expressing an idea that is often proposed by left-wingers in other fields.
Principal Components Analysis
I want to write about this too, but almost certainly from a very different angle, dealing with communication and the flow of information. And perhaps at some point I will have the time.