One of the things I hate in mathematical textbooks is proofs left as exercises for the reader.
I would be really interested to know what conclusions you have drawn about inferential distance.
Jennifer is suggesting that these ideas could be used to quantify inferential distances. A first attempt might be to say that a Speaker and a Listener are separated by a large inferential distance when the Speaker has a much larger value for P(U|T) than the Listener does.
There seems to me to be something important left out, though. I take inferential distances to be about differences in the plausibility of a conclusion to different people. Even if you understand my claim perfectly (i.e., you’ve mapped my U to the proper T), you might still consider T to be almost certainly wrong, while I consider T to be an inevitable conclusion of self-evident premises, even if it takes a long chain of inferences to get from the premises to the conclusion.
Speaking from my personal experience, when I as a listener had trouble accepting a conclusion that the speaker considered natural and perhaps obvious, it was rarely because I had misinterpreted the meaning (I am speaking here about conclusions I later came to accept as obvious, so I can judge whether I had understood what was said at the time). The reason was rather that I lacked some background knowledge or thinking habits, which made my P(T) low, not my P(U|T).
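As a toy illustration of that distinction (the numbers are invented, and the two-hypothesis Bayes update is just a minimal sketch): if speaker and listener interpret the utterance identically and differ only in their priors, they still end up far apart on T.

```python
# Toy Bayes update over two hypotheses, T and not-T. Both parties assign the
# same likelihoods P(U|T) and P(U|not-T), i.e. they interpret the utterance U
# the same way; only their priors P(T) differ.

def posterior_T_given_U(prior_T, p_U_given_T, p_U_given_not_T):
    """P(T|U) by Bayes' rule."""
    joint_T = prior_T * p_U_given_T
    joint_not_T = (1.0 - prior_T) * p_U_given_not_T
    return joint_T / (joint_T + joint_not_T)

speaker = posterior_T_given_U(prior_T=0.95, p_U_given_T=0.9, p_U_given_not_T=0.1)
listener = posterior_T_given_U(prior_T=0.05, p_U_given_T=0.9, p_U_given_not_T=0.1)

print(f"Speaker  P(T|U) = {speaker:.3f}")   # about 0.994
print(f"Listener P(T|U) = {listener:.3f}")  # about 0.321
```

The gap here comes entirely from P(T), which is exactly the background-knowledge component described above, not from any disagreement about P(U|T).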
You may already know this, but:
And remember, if that one doesn’t strike your fancy, you can always employ one of these alternative ways for proving your result.
It seems that “exercises left for the reader” are not generally well received. Arundelo and Kaj are correct: I meant that sort of as a joke and sort of as an invitation to a conversation with a capable audience. This is LW, right? So the audience should be capable :-)
I liked this link and quote because they seemed so productive for mechanizing and experimenting on concrete issues surrounding the relatively hand-wavy concept of inferential distance. I’m not sure how such a research program would turn out, but the quote makes it more plausible to me that researchers could get traction if they dug in.
The link suggests that software already exists in rudimentary form (and might be developed with better calibration specifically for this issue) to operate on digital text and characterize the bits of information it contains. Those bit counts serve as measurements mathematically related to hypothesized Bayesian distributions over human communicative intent and sense of model plausibility.
It doesn’t seem that hard to imagine a program of study around the issue: refine text compression software until it gives “the right answer” when applied to text generated by experimental human subjects who are encouraged to communicate about toy problems. A problem would be explained to one subject, communicated to another in the course of the experiment, and then validated as successfully transmitted by a comprehension test given to the second subject.
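As a very crude sketch of the kind of measurement involved (assuming nothing more than an off-the-shelf compressor; the compressed size is only an upper bound on information content relative to zlib’s implicit model, not a calibrated Bayesian quantity):

```python
import zlib

def information_bits(text: str) -> int:
    """Crude upper bound on the information content of `text`, in bits,
    as judged by a general-purpose compressor (zlib at max compression)."""
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))

# Hypothetical transcripts from the kind of experiment described above.
explanation_for_novice = (
    "A prime number is a whole number greater than one whose only divisors "
    "are one and itself, so 2, 3, 5, 7, and 11 are primes but 9 is not."
)
explanation_for_peer = "Primes: integers n > 1 with no divisors besides 1 and n."

print(information_bits(explanation_for_novice))
print(information_bits(explanation_for_peer))
```

A real version would presumably need a compressor trained or calibrated on human communication rather than generic byte statistics, which is the refinement step imagined above.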
Perhaps geometrical diagrams could be serialized and measured this way somehow (see the sketch below)? It would be interesting to use simple pictures as the “T”, in part because the bit about “a prior P(T) over the possible thoughts we may be conveying” is obviously important, but spelling out the details might be hard. I think some people would be tempted to use “language in one’s head” as the model for thoughts (so U and elements of T are directly comparable via trivial methods), but starting out with “probability distributions over possible visual representations” seems likely to avoid a local research optimum where only “language-focused people” think the results are very general.
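For what “serializing a diagram” might look like, here is a minimal sketch under an invented toy format (a list of shapes rendered to a canonical byte string before compression; nothing here comes from the linked post):

```python
import json
import zlib

# Invented toy representation of a simple geometric diagram.
diagram = [
    {"shape": "circle", "x": 10, "y": 10, "r": 5},
    {"shape": "rect",   "x": 20, "y": 5,  "w": 8, "h": 3},
    {"shape": "line",   "x1": 0, "y1": 0, "x2": 30, "y2": 15},
]

# Canonical serialization (sorted keys, no whitespace), so that identical
# diagrams always compress from identical bytes.
serialized = json.dumps(diagram, sort_keys=True, separators=(",", ":")).encode("utf-8")
print("compressed size in bits:", 8 * len(zlib.compress(serialized, 9)))
```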
You could also just use the existing software on existing text to try to predict inferential distance between existing domain experts (religious texts and priests from different religions? science texts and academics from different departments?), trying to see if the software’s numbers predict “something about interactions” that you could measure, which would reveal “false inferential distance assumptions” if they really existed. If they do, you’d have tools and details for picking apart the concept.
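One existing, concrete candidate for “the software’s numbers” on a pair of texts is the normalized compression distance from the compression literature (my suggestion, not something from the linked post); a zlib-based sketch:

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size in bytes under zlib at max compression."""
    return len(zlib.compress(data, 9))

def normalized_compression_distance(a: str, b: str) -> float:
    """Roughly 0 for texts sharing most of their structure, near 1 for
    texts that share little the compressor can exploit."""
    ca, cb = csize(a.encode("utf-8")), csize(b.encode("utf-8"))
    cab = csize((a + b).encode("utf-8"))
    return (cab - min(ca, cb)) / max(ca, cb)

# Hypothetical short stand-ins for longer corpora by two domain experts;
# zlib is noisy on snippets this small, so real use would want whole documents.
text_a = "We hold that salvation is attained through grace alone."
text_b = "Selection acts on heritable variation in reproductive success."
print(normalized_compression_distance(text_a, text_b))
```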
Assuming it exists, what if the expectation of short inferential distances wasn’t caused by the fact that we evolved in small tribes with mostly common knowledge, but instead grew out of simple planning fallacies about how easy it is to teach something? Or maybe it changes with one’s experience of cultural homogeneity, and some people are wrong in the other direction, based on many experiences with people of radically different beliefs and no inclination to spend the time that would be required to update with them?
Those are just off the top of my head, but they are the kinds of things that came to mind when I thought about LW while reading Language Log. I was hoping for some responses along the lines of “oh hey, that’s helpful, it makes me think of X” :-)