Just a quick note from a Twitter/X discussion.

It’s an almost-universal scientific rule that empirics is blind without a model. The left picture below shows a bunch of points, and a regression line fitting them. It seems like a good fit. But why should we believe it? The right picture shows the same points with a curve that fits them perfectly. We prefer the left-hand line because of some theory like “these two variables are probably linearly related, and also y is partly random”. That theory is embodied in an equation like y = a + bx + ε, where a is the intercept, b is the slope of the line and ε is random noise which makes the fit imperfect. Without that, we’d have to accept that the stupid right-hand curve fits the data better.

Thirty points and two alternative models to explain them
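Here is a minimal sketch of the same contrast in code. Everything in it is an assumption for illustration: eight made-up points rather than thirty, a true line y = 1 + 0.5x, and a Lagrange polynomial standing in for the "perfect" curve.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial that passes through every point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Hypothetical data: a linear relationship plus random noise.
rng = random.Random(0)
xs = [float(i) for i in range(8)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

a, b = fit_line(xs, ys)
line_sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
curve_sse = sum((y - interpolate(xs, ys, x)) ** 2 for x, y in zip(xs, ys))

print("in-sample squared error, line:", line_sse)
print("in-sample squared error, curve:", curve_sse)
# Predictions just outside the data, where the two models can disagree sharply.
print("prediction at x = 8.5, line:", a + b * 8.5)
print("prediction at x = 8.5, curve:", interpolate(xs, ys, 8.5))
```

The curve wins on in-sample fit by construction. Only the theory behind the line gives a reason to trust its predictions more.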

Heritability is the proportion of variation in something that can be explained by people’s genes. (For social scientists: imagine doing the perfect regression on the outcome, using all relevant genetic variables. Heritability is the R-squared.) The classic way you estimate heritability is by comparing identical to fraternal (non-identical) twins. Identical twins share all their genetic variation; fraternal twins share only half of it. Assuming no assortative mating! If people marry people with similar genes then fraternal twins and other siblings will have more than half their genes alike.

So you can back out heritability from that. If identical twins are exactly alike on the variable of interest, with a correlation of 1, and fraternal twins are correlated at 0.5, then the heritability is 1. More generally, the heritability is twice the difference in correlations.
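As a tiny sketch, the rule of thumb is one line of arithmetic. The function name is mine, and the second pair of correlations is made up.

```python
def falconer_heritability(r_identical: float, r_fraternal: float) -> float:
    """Twice the difference between identical- and fraternal-twin correlations."""
    return 2.0 * (r_identical - r_fraternal)

print(falconer_heritability(1.0, 0.5))   # 1.0, the example in the text
print(falconer_heritability(0.75, 0.5))  # 0.5, with made-up correlations
```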

Twin studies are genius because they leverage theory. We know people get half their genes from each parent, that in fraternal twins the genes are drawn independently, and in identical twins they are exactly the same. Without this insight, there would be no way to separate the effects of parental environment and genetics. Genes come from your parents, so does family background. A study of parents and children couldn’t disentangle these effects. The theory let scientists disentangle genes and family environment, even before it was practical to measure people’s DNA variants.

But are twin studies right? What if identical twins grow up in a more similar environment than fraternal twins? That seems plausible because maybe parents treat them more similarly. Then again, what if there is assortative mating? Because, like, there is. Then fraternal twins are more like identical twins, sharing more than half their genes, and you need to multiply the difference in correlations by more than two.
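Both worries can be written into the arithmetic. This is a sketch under assumptions of my own, not taken from any of the papers below: twin correlations split additively into a genetic part h² and a shared-environment part, fraternal twins have genetic correlation ρ, and the shared-environment part is c²_MZ for identical twins and c²_DZ for fraternal twins.

```latex
\begin{align*}
r_{MZ} &= h^2 + c^2_{MZ}, \\
r_{DZ} &= \rho\, h^2 + c^2_{DZ}, \\
r_{MZ} - r_{DZ} &= (1 - \rho)\, h^2 + \left(c^2_{MZ} - c^2_{DZ}\right), \\
h^2 &= \frac{(r_{MZ} - r_{DZ}) - \left(c^2_{MZ} - c^2_{DZ}\right)}{1 - \rho}.
\end{align*}
```

With ρ = 1/2 and equal environments this is the usual factor of two. With ρ = 0.6 (a made-up number) the multiplier becomes 1/(1 - 0.6) = 2.5. And if identical twins share more environment, so that c²_MZ > c²_DZ, the plain difference in correlations overstates heritability.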

So there are a few recent papers trying to get estimates of heritability by using information beyond twins.

This paper reckons that twin studies underestimate the heritability of education, because of assortative mating. But they also show^[1] that many different models fit the data equally well. Heritability is 36-39% or 51-56%, depending on which kind of assortative mating you believe in.
This paper argues that twin studies overestimate the heritability of education, because identical twins actually grow up in more similar environments than fraternal twins. Heritability is only 9%! They identify the environmental difference between the kinds of twins by looking at twins’ spouses and children. It’s a bit hard to explain. I think there is an assumption that twins’ nephews share similar environments with twins’ spouses’ nephews, and somehow this assumption holds the model together, like a strategically-located pin in a complicated brassière.
This paper uses not twins, but remote cousins, to estimate heritability of education. They get very low estimates (only 7%) and find that environmental transmission is better at explaining correlations of outcomes between remote cousins. But they don’t allow for assortative mating except via (a) observed education and (b) cultural transmission. I’m doubtful of that because there’s evidence for much more assortative mating on genetics than could be explained by mating on observed education alone.^[2]

You may think from these numbers that despite a lot of effort, researchers are still far from consensus about how heritable education is, and that the different answers seem to come from the assumptions baked into different models.

Five toy worlds

To illustrate what I feel about this literature, here are some toy examples.

In world 1 there are two genetic types: Wordcels W and Shape Rotators S. Shape Rotators all go to university, Wordcels never do. Decline of the humanities. Very sad. Parents can’t do anything about it. However hard you beat your Wordcel brat, he’ll never amount to anything. The heritability is 100%. If we run a twin study, all identical twins have the same outcomes, and the correlation of fraternal twins is 50%.

In world 2 there are two genetic types: Wordcels W and Shape Rotators S. But in this world genetics don’t get you to university. All that matters is if you beat your children. Children who are beaten go to university. Children who are spoiled (not beaten) do not. Parents never beat their W children but always their S children, so only S’s go to university.

The Zeiss planetarium in Jena: a toy world to help us understand the real world

This world looks exactly like the previous world, unless you measure what parents do. This is our first problem. If children’s environment correlates perfectly with their genetics, then we can’t tell which matters. If it correlates imperfectly, then we can tell, but we will be estimating it based on very few people. (Suppose 1% of parents are contrarians who only beat their Wordcel children. Then you can find out whether genetics or beatings matter, but only from 1% of your sample. And maybe those guys are weird in other ways.)

What’s the heritability in world 2? You might swiftly answer 0%, because genetics don’t get you to university. But this ignores how the child’s genetics is affecting its parents! Consider:

World 2a: parents always beat their S children because they are S types.

A twin study in this world will show that heritability is 100%: either identical twins are both S, both get beaten and both go to university, or they are both W, don’t get beaten and don’t go to university.
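A rough simulation shows why a twin study cannot tell world 1 from world 2a. It rests on a toy assumption that is mine, not part of the worlds above: fraternal twins share a single type draw with probability 1/2 and otherwise get independent draws, which gives a type correlation of 0.5. The function names and sample size are also made up.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def draw_twin_types(identical, rng):
    """Return the types ('S' or 'W') of a twin pair under the toy assumption."""
    first = rng.choice("SW")
    if identical or rng.random() < 0.5:
        return first, first
    return first, rng.choice("SW")

def university_world1(child_type):
    # World 1: genes act directly, so S types go to university.
    return 1 if child_type == "S" else 0

def university_world2a(child_type):
    # World 2a: parents beat S children, and only beating leads to university.
    beaten = child_type == "S"
    return 1 if beaten else 0

def twin_correlations(outcome, n_pairs=20000, seed=0):
    """Outcome correlations for simulated identical and fraternal twin pairs."""
    rng = random.Random(seed)
    result = {}
    for label, identical in (("identical", True), ("fraternal", False)):
        pairs = [draw_twin_types(identical, rng) for _ in range(n_pairs)]
        a = [outcome(t1) for t1, _ in pairs]
        b = [outcome(t2) for _, t2 in pairs]
        result[label] = pearson(a, b)
    return result

world1 = twin_correlations(university_world1)
world2a = twin_correlations(university_world2a)
print("world 1: ", world1)
print("world 2a:", world2a)
```

The two outcome functions are written differently but return the same thing for every child, which is the whole point: unless you record the beatings, the data are the same.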

World 2b: parents always beat their S children, but this is not causal. Instead, beating is a family tradition which is handed down along with S genes. To keep things simple, let’s say there is perfect assortative mating by the S and W type.

A twin study in this world won’t work because all siblings always have the same type due to the assortative mating. But heritability is 0%. All that matters here is the cultural tradition. Genes are along for the ride.

Since heritability in world 2a is 100%, does that mean there is no way to get more people to university? Not so fast! Maybe we could launch a public health campaign to encourage parents to beat their Wordcel children. To think that idea through, let’s consider:

World 3: Only beaten children go to university; genes don’t matter. Parents beat their children because they are S types. Specifically, they falsely believe that a W type would not go to university, even if beaten. So they prefer not to beat them.

In this world, even though heritability is 100%, the public health campaign would work in its stated aim of getting more children to university. The parents are wrong!

World 4: parents beat their children because they are S types. Specifically, they have the following true beliefs about the world: beaten S types go to university; unbeaten W types go to university half the time. Also, an S child who isn’t beaten gets lazy, plays computer games and follows Andrew Tate on YouTube. A W child who is beaten gets upset and starts a Tumblr. Neither of these ever goes to university.

Measured heritability is now lower than 100% (because some W types go to university) but in this world, the public health campaign would not work. In fact it would backfire in its stated aim: beaten Wordcels would go to university less. There is a gene-environment interaction between W-S genetics and beatings; the parents know this and respond to it, leading to an evocative gene-environment correlation.

Worlds 3 and 4 are very hard to tell apart observationally: in both, parents beat their S children and spare their W children, so we never see what happens to a beaten Wordcel or an unbeaten Shape Rotator. But they have very different implications! In world 3, you can improve the world by encouraging parents to beat their Wordcel children. In world 4, you can’t: if you tell parents to stop beating their S child, or to beat their W child, you just make things worse for the outcome you care about. Parents are optimizing.
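To make the counterfactual concrete, here is a small sketch that computes expected university rates in worlds 3 and 4, with and without the campaign. The 50/50 split between S and W children and all function names are assumptions for illustration.

```python
def p_university_world3(child_type, beaten):
    # World 3: only beating matters; genes are irrelevant.
    return 1.0 if beaten else 0.0

def p_university_world4(child_type, beaten):
    # World 4: beaten S always goes, unbeaten W goes half the time, nobody else goes.
    if child_type == "S":
        return 1.0 if beaten else 0.0
    return 0.0 if beaten else 0.5

def parents_status_quo(child_type):
    # What parents do in both worlds: beat S children only.
    return child_type == "S"

def campaign_beat_everyone(child_type):
    # The public health campaign, assumed to be fully effective.
    return True

def university_rate(p_university, policy, share_s=0.5):
    """Expected share going to university; share_s is a made-up population share."""
    shares = {"S": share_s, "W": 1.0 - share_s}
    return sum(share * p_university(t, policy(t)) for t, share in shares.items())

for name, world in (("world 3", p_university_world3), ("world 4", p_university_world4)):
    before = university_rate(world, parents_status_quo)
    after = university_rate(world, campaign_beat_everyone)
    print(name, before, "->", after)
# world 3 0.5 -> 1.0
# world 4 0.75 -> 0.5
```

The campaign changes behaviour in cells of the table that the status quo never fills. Which way the outcome moves depends entirely on what you assume about those cells.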

So if the public health campaign works, everything is fine?

World 5: parents beat their children because they are S types. Specifically, they have the following true beliefs about the world: all beaten children go to university. But, while S types aren’t harmed by being beaten, W types are seriously harmed by it. The parents care about their children’s welfare, so they don’t beat W types even though that would get them to university.

Here, just like world 3, heritability is 100% and yet the public health campaign “works”. But that depends what you count as working. The parents have knowledge about their children’s welfare that the social scientist does not observe. The public health campaign reduces welfare, even though it fulfils its stated goal.

More or less controversial suggestions

Here are my conclusions from these toy worlds. I don’t think they’re new to thoughtful researchers in the field, but equally I don’t see people acting on them, so perhaps they are worth stating.

When you are predicting individual outcomes of genes and environments, it is important to be able to maintain assumptions about how things work. Without some assumptions, you are not going to have enough data to draw conclusions. The classic example is the assumption of twin studies that identical/fraternal twins share all/half their DNA. This can be wrong, but it comes from rigorous theory, and we know what to do about it when it is wrong.

Unfortunately, when genes and environments interact and correlate, that assumption about DNA won’t get you everything you need. We need other assumptions. Also unfortunately, theories as solid and tested as that of the mechanisms of genetic inheritance are rare. (We don’t know precisely how cultural inheritance works, for example.)

When different papers are deriving widely different estimates of heritability from quite similar data, it’s time to check and foreground the assumptions built into their models.

In the struggle to pin down parameters, purely environmental sources of variation are a valuable resource.

I am especially thinking of within-family sources of environmental variation. Our paper used two: birth order and parental age. These are often widely available, so you’re not dealing with a small, unusual sample.

The point of those sources is that you can fix a parameter to zero. The correlation of birth order with genetics is zero within families. Ditto for parental age.^[3]
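A minimal sketch of what that buys you, on simulated two-child families. All numbers are invented, including the birth order effect of 0.3, and the data-generating process is my assumption: each child’s genes are a family component plus a draw that does not depend on birth order.

```python
import random

def simulate_families(n_families=20000, birth_order_effect=0.3, seed=1):
    """Return a list of families, each a pair of (genes, outcome), firstborn first."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        family_genes = rng.gauss(0, 1)  # component shared by both siblings
        kids = []
        for order in (1, 2):
            # Child-specific genetic draw, independent of birth order by assumption.
            genes = family_genes + rng.gauss(0, 1)
            firstborn = 1 if order == 1 else 0
            outcome = genes + birth_order_effect * firstborn + rng.gauss(0, 1)
            kids.append((genes, outcome))
        families.append(kids)
    return families

def mean(xs):
    return sum(xs) / len(xs)

families = simulate_families()
# Within-family differences: firstborn minus second-born.
gene_gap = mean([first[0] - second[0] for first, second in families])
outcome_gap = mean([first[1] - second[1] for first, second in families])

print("average genetic gap between siblings:", gene_gap)
print("average outcome gap between siblings:", outcome_gap)
```

Because the genetic gap averages out to roughly zero within families, the outcome gap can be read as an environmental effect of birth order. In real data the genes are unobserved, and that zero is an assumption you bring rather than something you check.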

Economic models also offer a powerful, though controversial, way to pin down a model.

What defines economic theory is: somebody optimizes something. Like purely environmental variation, that narrows down what is possible: any variable that is the result of an individual’s choice must be the result of that individual maximizing some utility function. Unfortunately, the form of people’s utility functions is not as precisely known as the way genes are inherited, and whether people really optimize anything is itself more controversial.

Nevertheless, economic models put some structure into your world, and in a way that ties in to a big existing body of theory and empirics.

In particular, economic models imply that each class of actor, as defined by the preferences and resources they possess, will do the same thing. This reduces the amount of variation we observe, and it also means that the variation we do observe is not a random sample of all the possibilities. Rather, it is defined by people doing what is best for them. For example, this might imply that parents, and/or children themselves, match environments to genetics in a way they expect to be optimal. That makes it harder for the scientist to disentangle the effects of the two; it also should put us on our guard for hidden variables, in cases where genes and environments seem not to match.

The framework in genetics of accounting for observed variation is not necessarily helpful in guiding us towards the right questions. Worlds 3 and 4 show the same observed pattern of parenting: S children are beaten, W children are not. But their causal structure is different. And so the results of counterfactual interventions would be different, and policy recommendations should be different also.

One advantage of economic theory is that it can help with counterfactual predictions. For example, suppose that parents are maximizing their children’s education, and that they correctly predict the outcomes of their actions. We rule out world 3. Of course, that ruling out is coming from theory not data! Maybe parents are maximizing something else. Nevertheless, trying to write down what the analyst thinks they are maximizing may help him to understand bits of the world that are hard to measure directly.

So, I’m trying to push two ideas: (a) seek within-family sources of variation of the child’s environment and (b) build an economic theory of what parents do into your genetically-informed family study. I think (a) is probably less controversial than (b), and probably for good reasons. But if you put them together, you might get something like this:

A model of how parents allocate their parenting time/effort, given their number of children, and what they observe about each child’s phenotype.
Given that, a prediction about the child’s environment within a given family, partly based on birth order and parental age (independent of genetics)
Empirics on how genes and environment interact in producing child outcomes, using birth order and parental age to separate the effects of environment from genetics.

(Open question: do we also consider parents’ choice to have (more) children? That would be a stretch goal.)

Done right, economic models have another advantage. Because people in the models have preferences, the models give us ways to measure social welfare. Actually, you can flip this around. If as a result of some empirical analysis, like a family or twin study, someone makes a policy recommendation, then they must have an implicit definition of social welfare. Economic models make the definition explicit.

World 5 gave an example where parents are effectively maximizing child welfare, in ways that are not necessarily visible to the scientist. That is not always, or even typically, true in the economic framework. Individuals may be maximizing their own welfare, but an intervention could still improve things because of e.g. externalities. (Think of a world where lazy parents beat their children more than is socially optimal, because responsible parenting takes time and effort.) But in economic models, there is always a relationship between decision and welfare. People in the model have goals which they try to achieve; the aim of policy is to help them. If people trade off e.g. educational achievement and emotional welfare, then the policy goal must respect that. Paternalism is not allowed!

So this gives the final outcome you might shoot for:

Counterfactual predictions and welfare analysis, using the model, for changes to parental effort allocation, based on the idea that parents are optimizing, not just doing stuff at random, and that they observe facts the analyst does not.

The above is all very ambitious; doing even part of it might be a useful contribution.

^[1] In Table 4.
^[2] Social science datasets typically record whether you went to university, but not e.g. which university you went to or which degree you studied. So this is pretty coarse data!
^[3] In fact, you could challenge either of these! Father’s age actually does affect genetic mutation but I think that for many cases, the effects are small enough to be negligible; if people choose how many children to have based on their prior children, then birth order could correlate with genetics, but I’ve never seen evidence for it. Just like for twin studies, assumptions are rarely cast-iron, but some are stronger than others.
