I’m not sure how much I agree/disagree with this post. Potentially relevant: my defense of g here, not sure what you’d make of this.
But what would it mean to intervene on the climate node?
We know that no single factor controls the climate. “Desert” and “rain-forest” are just labels for types or regularities in a weather system. Since climate is an emergent feature, “intervening on climate” means intervening on a bunch of geographic variables.
I don’t really agree that this is a problem for regarding the climate as a causal variable. Counterfactuals don’t have to correspond to interventions you can physically perform. Rather, they are thought experiments, “what would it be like if this were different?”, which you can run regardless of whether you can intervene in an isolated sense or not.
Insofar as differences in long-term weather trends between locations all share common causes (which get lumped together under ‘climate’), it seems to me that understanding the consequences of these weather trends could often benefit from abstracting them into an overall “climate” variable. Of course sometimes, such as when you are trying to understand how exactly these trends arise, it may be useful to disaggregate them (as long as you can do so accurately! an overly aggregated true model is better than a highly disaggregated but highly misleading model). That said, I’m not familiar enough with climate to say much about when it is or is not useful to lump it like this.
A problem we already spotted: thinking of a predictive category (like climate) as a causal variable can lead you to think that you can intervene on climate in isolation from the rest of the system.
I think this is a problem that happens with causality more generally. Consider, for instance, the appraisal theory of emotion: the claim that how you feel about things is a result of how you cognitively appraise them. So if you feel fear, that is a result of appraising some situation as probably dangerous.
This theory seems true to me. It makes sense a priori. And to give an example, I was once flying on a plane with someone who was afraid of flying. During the flight, the wings of the plane wobbled a bit, and when he saw that, he got very afraid that they were going to break and we were going to crash. (Before that, he hadn’t been super afraid; he was relatively calm.) This seemed to be a case of appraisal: looking at the wobbly wings, feeling that they weren’t strong enough and therefore might break, and that this might lead to the plane crashing.
So suppose we buy the appraisal theory of emotion. I’ve heard that this has led to therapies where, after a disaster has struck and e.g. killed someone’s family, the person’s therapists have suggested that the person try to reframe their family’s death as something positive in order to improve their mood, which is obviously going to leave that person feeling that the therapist is crazy/condescending/evil/delusion-encouraging/???. This doesn’t mean that the original causal theory of emotions is wrong; it just means that sometimes you cannot or should not directly intervene on certain variables.
But there’s an even deeper problem. Think back to personality types. It’s probably not the case that there’s an easily isolated “personality” variable in humans. But it is possible for behavior to have regularities that fall into similar clusters, allowing for “personality types” to have predictive power. Focus on what’s happening here. When you judge a person’s personality, you observe their behavior and make predictions of future behavior. When you take a personality quiz, you tell the quiz how you behave and it tells you how you will continue to behave. The decision flow in your head looks something like this (but with more behavior variables):
[image]
All that’s happening is you predict behavior you’ve already seen, and other behavior that has been known to be in the same “cluster” as the behavior you’ve already seen. This model is a valid predictive model (results will vary based on how good your pattern recognition is) but gives weird causal answers. What causes your behavior? Your personality. What causes your personality? Your behavior.
I have a lot of issues with personality tests (e.g. they have high degrees of measurement error, the items are overly abstracted, nobody has managed to create an objective behavioral personality test despite many people trying, the heritability results from molecular genetic studies are in strong contradiction to the results from twin studies, etc.), but I think this is the wrong way to see it.
There’s a distinction between the decision flow in your head and the causal model you’d propose to underlie it, because while causal influence only travels down arrows, correlations travel up and then down arrows. That is, if you have some sort of situation like:
behavior at time 1 ← personality → behavior at time 2
then this implies a correlation between behavior at time 1 and behavior at time 2, and you can use this correlation to predict behavior at time 2 from behavior at time 1. Thus, for the personality model, you don’t need the arrows going into personality, only the arrows going out of it.
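To make this concrete, here’s a minimal simulation of that fork structure (the 0.8 loadings and 0.6 noise scale are made-up illustrative numbers, not anything from real personality data): a latent “personality” variable causes both behaviors, and behavior at time 1 predicts behavior at time 2 purely through the correlation the common cause induces, without any arrow pointing into personality ever being used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Latent common cause: never directly observed or intervened on.
personality = rng.normal(size=n)

# Each behavior = loading * personality + independent noise.
behavior_t1 = 0.8 * personality + rng.normal(scale=0.6, size=n)
behavior_t2 = 0.8 * personality + rng.normal(scale=0.6, size=n)

# The fork implies corr(t1, t2) = 0.8 * 0.8 = 0.64 (both variances are 1),
# even though neither behavior causes the other.
print(np.corrcoef(behavior_t1, behavior_t2)[0, 1])  # ~0.64

# Prediction only needs this correlation, i.e. the arrows going *out of*
# personality; no arrow into personality is required.
slope = np.cov(behavior_t1, behavior_t2)[0, 1] / np.var(behavior_t1)
predicted_t2 = slope * behavior_t1
```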
Finally we circle back to essences. You can probably already put together the pieces. Thinking with essences is basically trying to use predictive categories as causal nodes which are the source of all of an entity’s behavior. This can work fine for predictive purposes, but leads to mishaps when thinking causally.
I don’t think this is a wrong thing to do in general.
Consider for instance the species of an organism; in ancient times, this would have been considered an unobservable essence with observable effects, such as the physical form and dynamics of the organism. Today, we know the essence exists: DNA. DNA is a common underlying cause which determines the innate characteristics present in an organism.
In fact, I would argue that causal inference must start by postulating an essentialism, where some hidden unobservable variable causes our observations. After all, ultimately we only observe basic sense-data such as light hitting our eyes; the postulate that this sense-data is due to a physical world that generates it is completely isomorphic to other forms of essentialism that propose that correlations in variables are due to some underlying hidden essence. Without this essentialism, we would have to propose that correlations in sense-data over time are due to the earlier sense-data directly influencing later sense-data, which seems inaccurate. So I’d say that in a way, essentialism is the opposite of solipsism.
More generally, I think essentialism is a useful belief whenever one doesn’t think that one is observing all the relevant factors.
It’s not clear to me what if anything we disagree on.
I agree that personality categories are useful for predicting someone’s behavior across time.
I don’t think using essences to make predictions is the “wrong thing to do in general” either.
I agree climate can be a useful predictive category for thinking about a region.
My point about taking the wrong thing as a causal variable “leading you to overestimate your ability to make precise causal interventions” is actually very relevant to Duncan’s recent post. Many thought experiments are misleading/bogus/don’t-do-what-they-say-on-label exactly because they posit impossible interventions.
If I had to pick a core point of disagreement, it would be something like:
I believe that if you have a bunch of different variables that are correlated with each other, then those correlations are probably because they share causes. And it is coherent to form a new variable by adding together these shared causes, and to claim that this new variable is an underlying factor which influences the bunch of different variables, especially when the shared causes influence the variables in a sufficiently uniform way. Further, to a good approximation, this synthetic aggregate variable can be measured simply by taking an average of the original bunch of correlated variables, because that makes their shared variance add up and their unique variances cancel out. This holds even if one cannot meaningfully intervene on any of this.
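As a sketch of that last claim (with made-up, deliberately uniform loadings and unit variances, chosen only to keep the arithmetic transparent): each observed variable is a shared factor plus independent unique noise, and the plain average of the variables tracks the shared factor far better than any single variable does, because the shared parts add up while the unique parts average out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_vars = 100_000, 20

# One shared cause plus an independent unique cause per variable,
# all with variance 1.
shared = rng.normal(size=(n_people, 1))
unique = rng.normal(size=(n_people, n_vars))
observed = shared + unique  # broadcasting: each column = shared + own noise

# Any single variable is a noisy measure of the shared factor:
# corr = 1 / sqrt(2) ~ 0.71.
print(np.corrcoef(observed[:, 0], shared[:, 0])[0, 1])

# Averaging keeps the shared variance at 1 while the unique variance
# shrinks to 1/20 = 0.05, so corr = 1 / sqrt(1.05) ~ 0.98.
composite = observed.mean(axis=1)
print(np.corrcoef(composite, shared[:, 0])[0, 1])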
I have varying levels of confidence in the above, depending on the exact context, the set of variables, the deductions one wants to make on the basis of the common cause, etc., but it seems to me like your post is overall arguing against this sentiment, while I would tend to argue in favor of it.