Schelling Categories, and Simple Membership Tests

Zack_M_Davis, 26 Aug 2019 2:43 UTC

59 points

Followup to: Where to Draw the Boundaries?

Or there might be social or psychological forces anchoring word usages on identifiable Schelling points that are easy for different people to agree upon, even at the cost of some statistical “fit” …

The one comes to you and says, “That paragraph about Schelling points sounded interesting. What did you mean by that? Can you give an example?”

Sure. Previously on Less Wrong, in “The Univariate Fallacy”, we studied points sampled from two multivariate probability distributions $P_{A}$ and $P_{B}$ , and showed that it was possible to infer with very high probability which distribution a given point was sampled from, despite significant overlap in the marginal distributions for any one variable considered individually.

From the standpoint of “the way to carve reality at its joints, is to draw your boundaries around concentrations of unusually high probability density in Thingspace”, the correct categorization of the points in that example is clear. We have two clearly distinguishable clusters. The conditional independence property is satisfied: given a point’s cluster-membership, knowing one of the $x_{i}$ doesn’t tell you anything about $x_{j}$ for j ≠ i. So we should draw a category boundary around each cluster. Obviously. We might ask hypophorically: what could possibly change this moral?

More constraints on the problem, that’s what!

Suppose you needed to coordinate with someone else to make decisions about these points—that is, it’s important not just that you and your partner make good decisions, but also that you make the same decision—but that each of you only got to observe one coordinate from each point. As we saw, the predictive work we get from category-membership in this scenario is spread across many variables: if you only get to observe a few dimensions, you have a lot of uncertainty about cluster-membership (which carries over into additional uncertainty about the other dimensions that you haven’t observed, but which affect the ex post quality of your decision).

If you and your partner were both ideal Bayesian calculators who could communicate costlessly, you would share your observations, work out the correct probability, and use that to make optimal decisions. But suppose you couldn’t do that—either because communication is expensive, or your partner was bad at math, or any other reason. Then it would be sad if you happened to see $x_{9}$ = 2 and said “It’s an A (probably)!”, and your partner happened to see $x_{27}$ = 3 and said “It’s a B (I think)!”, and the two of you made inconsistent decisions.
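To make the "share your observations, work out the correct probability" step concrete, here is a minimal sketch. It assumes the per-coordinate marginals from "The Univariate Fallacy" (the same numbers that can be read off the expected-evidence sum later in this post) and an even prior over the two clusters. The function name `posterior_a` is made up for illustration.

```python
from fractions import Fraction as F

# Assumed marginals for every x_i with i in 1..40 (taken from the earlier post).
P_A = {1: F(1, 4), 2: F(7, 16), 3: F(1, 4), 4: F(1, 16)}
P_B = {1: F(1, 16), 2: F(1, 4), 3: F(7, 16), 4: F(1, 4)}


def posterior_a(observations, prior_a=F(1, 2)):
    """P(cluster A | observed values), using conditional independence within a cluster."""
    weight_a, weight_b = prior_a, 1 - prior_a
    for value in observations:
        weight_a *= P_A[value]
        weight_b *= P_B[value]
    return weight_a / (weight_a + weight_b)


print(posterior_a([2]))     # you alone, having seen x_9 = 2      -> 7/11
print(posterior_a([3]))     # your partner alone, seeing x_27 = 3 -> 4/11
print(posterior_a([2, 3]))  # both observations pooled            -> 1/2
```

Under these assumptions the two observations cancel exactly: each of you leans a different way on your own, while the pooled posterior is an even 1/2. That is the disagreement the paragraph above describes.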

Okay, now suppose that there’s actually a forty-first, binary, variable that I didn’t tell you about earlier, distributed like so:

$P_{A}(x_{41}) = \begin{cases} 3/4 & x_{41} = 0 \\ 1/4 & x_{41} = 1 \end{cases}$

$P_{B}(x_{41}) = \begin{cases} 1/4 & x_{41} = 0 \\ 3/4 & x_{41} = 1 \end{cases}$

Observing $x_{41}$ gives you $\log_{2} 3$ ≈ 1.585 bits of evidence about cluster-membership, which is more than the

$\frac{1/4 + 1/16}{2} \cdot \left| \log_{2}(4) \right| + \frac{7/16 + 1/4}{2} \cdot \left| \log_{2}(7/4) \right| + \frac{1/4 + 7/16}{2} \cdot \left| \log_{2}(4/7) \right| + \frac{1/16 + 1/4}{2} \cdot \left| \log_{2}(4) \right|$

≈ 1.18 bits you can get from any one observation of one of the $x_{i}$ for i ∈ {1...40}.
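The two figures can be checked numerically. The sketch below assumes the $x_{i}$ marginals implied by the terms of the sum (each weight is the average of $P_{A}$ and $P_{B}$ at one value, each logarithm is of their ratio). The helper name `expected_abs_log_ratio` is made up for illustration.

```python
from fractions import Fraction as F
from math import log2

# Assumed marginals for each x_i (i in 1..40), read off the terms of the sum.
P_A = {1: F(1, 4), 2: F(7, 16), 3: F(1, 4), 4: F(1, 16)}
P_B = {1: F(1, 16), 2: F(1, 4), 3: F(7, 16), 4: F(1, 4)}

# The forty-first, binary variable as defined in the post.
P_A41 = {0: F(3, 4), 1: F(1, 4)}
P_B41 = {0: F(1, 4), 1: F(3, 4)}


def expected_abs_log_ratio(p_a, p_b):
    """Average |log2(P_A(x) / P_B(x))|, weighting each x by (P_A(x) + P_B(x)) / 2."""
    return sum(
        float((p_a[x] + p_b[x]) / 2) * abs(log2(float(p_a[x] / p_b[x])))
        for x in p_a
    )


bits_x41 = expected_abs_log_ratio(P_A41, P_B41)
bits_xi = expected_abs_log_ratio(P_A, P_B)
print(f"x_41: {bits_x41:.3f} bits")  # x_41: 1.585 bits
print(f"x_i:  {bits_xi:.3f} bits")   # x_i:  1.180 bits
```

For $x_{41}$ both outcomes carry a likelihood ratio of 3 (or 1/3), so the weighted average is simply $\log_{2} 3$.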

If you and your partner can both observe $x_{41}$ , you might end up wanting to base your shared categories and language on that—calling a point an “A” if it has $x_{41}$ = 0, even though such points actually came from $P_{B}$ a full quarter of the time—even if $x_{41}$ itself has no effect on the quality of your decisions, and what you actually care about is wholly determined by the values of $x_{1}$ through $x_{40}$ ! It’s not the intension you would pick if you could make (and share) more observations—but ex hypothesi, you can’t.

If you and your partner only get to observe one variable, $x_{41}$ is your best choice—the single variable that gives you the most information about the “natural” cluster-membership. That also makes it a Schelling point—if you and your partner didn’t get to communicate in advance about how you want to draw your shared category boundaries, you could pick $x_{41}$ as your defining observation and be pretty confident your partner would make the same choice. We could imagine an even more pessimistic scenario in which the Schelling point category definition (a set of variables that “stuck out” from all the others) was less predictive than some other candidates—but if you couldn’t coordinate to pick one of the more predictive category systems, you might be stuck with the Schelling point.
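As a rough illustration of what coordinating on $x_{41}$ buys, the sketch below compares two policies. In the first, each partner labels the point from a different one of $x_{1}$ through $x_{40}$. In the second, both label it from the shared $x_{41}$. The assumptions are mine: the $x_{i}$ marginals from "The Univariate Fallacy", an even prior, and a rule where each observer names whichever cluster makes their single observation more likely. The helper names are hypothetical.

```python
from fractions import Fraction as F

# Assumed marginals for each x_i (i in 1..40), and the post's x_41.
P_A = {1: F(1, 4), 2: F(7, 16), 3: F(1, 4), 4: F(1, 16)}
P_B = {1: F(1, 16), 2: F(1, 4), 3: F(7, 16), 4: F(1, 4)}
P_A41 = {0: F(3, 4), 1: F(1, 4)}
P_B41 = {0: F(1, 4), 1: F(3, 4)}
HALF = F(1, 2)  # even prior over the two clusters


def p_says_a(p_true, p_a, p_b):
    """Chance a lone observer says "A" when the point really comes from p_true."""
    return sum(prob for value, prob in p_true.items() if p_a[value] > p_b[value])


def accuracy(p_a, p_b):
    """Chance a lone observer names the true cluster."""
    return HALF * p_says_a(p_a, p_a, p_b) + HALF * (1 - p_says_a(p_b, p_a, p_b))


# Separate coordinates: given the cluster, the two observations are independent,
# so the partners agree when both say "A" or both say "B".
agree_separate = F(0)
for p_true in (P_A, P_B):
    q = p_says_a(p_true, P_A, P_B)
    agree_separate += HALF * (q * q + (1 - q) * (1 - q))

# Shared x_41: same observation and same rule, so the labels always match.
agree_shared = F(1)

print(agree_separate, accuracy(P_A, P_B))    # 73/128 11/16
print(agree_shared, accuracy(P_A41, P_B41))  # 1 3/4
```

Under these assumptions, partners looking at separate coordinates give the same label only about 57% of the time, while partners who both defer to $x_{41}$ always agree and are each right 3/4 of the time.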

In conclusion, the right categories to use given constraints on communication and observation, might be different from the category boundaries you would draw from a “God’s eye view”, in part because consideration of which categories are easy for different agents to coordinate on is relevant, not just raw information-theoretic expressive power. Thus, “Schelling categories.”

Thanks for reading!

The one says, “No, I meant, like, a real world example, not some dumb math thing for nerds. What is this post really about?”

It’s about … math? Or like, the relationship between math and human natural language? Like, I was wondering what “second-order” caveats or complications there might be to the basic “carve reality at the joints” moral of our standard Bayesian philosophy of language, and some of the people I’ve been collaborating with lately had been talking a lot about the importance of intersubjective epistemology—that is, shared mapmaking, so—

“But where’s the actionable takeaway? What’s your real agenda here, huh?”

Oh. One of those readers, I see. Fine, I can probably think of some—how do you say?—“applications.”

Ummmm …

Let’s see …

Okay, here’s something, maybe. What’s the deal with the age of majority?

Society needs to decide who it wants to be allowed to vote, stand trial, sign contracts, serve in the military, &c. Whether it’s a good idea for a particular person to have these privileges presumably depends on various relevant features of that person: things like cognitive ability, foresight, wisdom, relevant life experiences, &c. In particular, it would be pretty weird for someone’s fitness to vote to directly depend on how many times the Earth has gone around the sun since they were born. What does that number have to do with anything?

It doesn’t! But if Society isn’t well-coordinated enough to agree on the exact prerequisites for voting and how to measure them, but can agree that most twenty-five-year-olds have them and most eleven-year-olds don’t, then we end up choosing some arbitrary age cutoff as the criterion for our “legal adulthood” social construct. It works, but it’s just a legal fiction—and not necessarily a particularly good fiction, as any bright teenagers reading this will doubtlessly attest.

If I told you that a particular fourteen-year-old was very “mature”, that’s a contentful statement: we have shared meaning attached to the word mature, such that my describing someone that way constrains your anticipations. But it’s a really complicated meaning, a statistical signal in behavior that your brain can pick up on, but which isn’t particularly verifiable to others who might have reasons to doubt my character assessment. In contrast, age is easy for everyone to agree on. We could imagine some hypothetical science-fictional Society that used brain scans and some sophisticated machine-learning classifier to determine which citizens get which privileges—but in our dumber, poorer world, calendars and subtraction will have to do.

In terms of Scott Garrabrant’s taxonomy of applications of Goodhart’s law, this is regressional Goodhart: Society wants to select for maturity, chooses age as a proxy, and in the process, ends up granting or withholding privileges that a more discriminating Society maybe wouldn’t.

The age of majority is a case of replacing a complicated, illegible category (“maturity”, the kind of abstract thing you might want to model as a cluster in a forty- or forty-one-dimensional space) with a simple membership test (an age cutoff that everyone knows how to compute). Different people might make different subjective (but not arbitrary) judgements of the complicated, illegible category, so in order to get a more intersubjectively robust verdict on category-membership, we rely on an objective measurement that everyone can agree on.

If no convenient objective measurement is available, another strategy is possible: we can delegate to some canonical trusted authority, whose opinion of the complicated category will take precedence over everyone else’s. An example of this is commodity grading standards. What is a “Grade AA” egg? Well, there’s a complicated definition written down in a manual somewhere that you could try applying yourself—but for most people, Grade AA eggs are simply “those which have been certified as Grade AA by the USDA.”^[1]

It’s even possible for the “simple objective measurement” and “delegate to an authority’s subjective judgement” strategies to be combined. In “The Ideology Is Not the Movement”, the immortal Scott Alexander writes about his model of the genesis of social groups—

Pre-existing differences are the raw materials out of which tribes are made. A good tribe combines people who have similar interests and styles of interaction even before the ethnogenesis event. Any description of these differences will necessarily involve stereotypes, but a lot of them should be hard to argue. [...] There are subtle habits of thought, not yet described by any word or sentence, which atheists are more likely to have than other people. [...]

The rallying flag is the explicit purpose of the tribe. It’s usually a belief, event, or activity that get people with that specific pre-existing difference together and excited. Often it brings previously latent differences into sharp relief. People meet around the rallying flag, encounter each other, and say “You seem like a kindred soul!” or “I thought I was the only one!” Usually it suggests some course of action, which provides the tribe with a purpose.

Eliezer Yudkowsky’s “A Fable of Science and Politics” depicts a fictional underground society split between two such tribes: a predominantly urban tribe that believes that the unseen sky is blue (and favors an income tax, strong marriage laws, and an Earth-centric cosmology), and a predominantly rural one that believes that the sky is green (and favors merchant taxes, no-fault divorce, and a heliocentric cosmology). In this story, beliefs about the color of the sky are functioning as the “rallying flag” for tribe-formation in Alexander’s model—and as a Schelling point for category definition.

We don’t know how to talk about the preëxisting undefinable habits of thought that make social groups work—it’s hard to explicitly articulate what exact statistical regularity our brains have detected in five-and-more-dimensional locale/sky-belief/tax-belief/divorce-belief/cosmology/&c.-space. (Although we could imagine some hypothetical science-fictional Society that did know how to articulate it, and consequently had richer forms of social and political organization than our own.) It’s a lot simpler to talk about whether someone has pledged allegiance to the rallying flag: just ask someone, “What color do you believe the sky is?” (using sky-beliefs as an “objective” simple membership test), or simply, “Are you a Blue or a Green?” (delegating the classification problem to the person themselves as the authority whose discernment is to be trusted)—and whatever they say, that’s what they are.

Well, probably. We’ve seen that objective measurements like age are subject to regressional Goodhart, but the delegation-to-authority strategy is furthermore subject to adversarial Goodhart: once a category-membership test has been established, some agents might have an incentive to create examples that pass the test, but don’t have the complicated, illegible properties that made the test a useful proxy in the first place.

We’ve seen this, for example, with title inflation: we expect the “job title” (the words that get printed on business cards or immigration sponsorship forms) to be the canonical description of what someone “does”, even if the vagaries of the workday encompass many tasks,^[2] and an alien anthropologist tasked with observing the worksite and summarizing what each of the humans did might slice up her observations into categories with little resemblance to the company’s formal org chart. But since we don’t know how to do the obvious thing and average over all possible alien anthropologists weighted by simplicity, we can only rely on the org chart—which people have political incentives to manipulate, with the result that everyone in the finance industry is a “vice president” of some sort or another.

But “Vice President” has a literal meaning. Or it used to. Vice, “in place of; subordinate to.” President, one who presides over some deliberative body. The adversarial-Goodhart pressures on language “exploit[ ] the trust we have in a functioning piece of language until it’s lost all meaning”.

So for readers who demand a takeaway beyond just an edge case in the math, perhaps take away this: coordination is costly. From the standpoint of language as an AI capability, the social constructions that feeble humans need in order to work together may be unavoidably dumbed-down for mass consumption, but that’s no reason to not aspire to the true precision of the Bayes-structure to whatever extent possible.

(Thanks to Ben Hoffman for the etymology of “Vice President.”)

^[1] Or the analogous agency in your country.

^[2] When I worked in a supermarket, two days a week I did Tracy’s bookkeeping/customer-service job while Tracy had her weekend, which entailed counting the money from last night’s tills and swapping in new coinmags and completing the FSM report and answering the phone and selling money orders and covering the floral stand when the floral lady was on lunch, &c. I’m actually not sure what official name this role had in Safeway’s official org chart. We just called it “the booth.”


Philosophy of Language