So as I said before, you changed my mind, but now that I’ve had some time to mull it over, I’ll articulate why I instinctively felt something was wrong with the Big 5... because everything you say is true, but I still have a nagging feeling of wrongness. I’ve put the more important parts in >quotation format, since these are unverbalized thoughts still in formation that tend to ramble, and I don’t have time to pare them down properly at the moment.
Back when I first looked at the questions used to formulate the Big 5 (my memory indicates a much longer questionnaire than what I see when I google it now), I was bothered by questions such as “I tend to tell people when I disagree with them” and “I like to make people comfortable” both being held under the same variable (agreeableness).
It seemed to me like there were many underlying factors at play in how someone might answer such a question: “aversion to causing negative emotions in others” (niceness) and “aversion to drawing anger towards oneself” (conformity) competing with “aversion to white lies”. The result is that an honest-to-a-fault person looks behaviorally similar to a jerk, and kindness looks behaviorally similar to weakness. I had a similar pattern of objection for conscientiousness (sense of responsibility, preference for order, attention to detail, willpower) and extroversion (preference for companionship, social skills, lack of social anxiety, general energy level).
So then, I think the “general form” of my objection is:
“Factors X and Y are unrelated, but X and Y both contribute to factor P. You’re creating a Construct C, measuring X, Y, and P separately, and tossing them all in the C bucket, and that annoys me.”
Why did it annoy me? Well, it’s mostly because in some cases, we called Construct C by a name that I felt ought to be reserved for Factor X. For example, I intuitively feel “Extrovert” ought to be reserved for “the degree to which one prefers comfortable social situations to comfortable solo situations” and should not be entangled with social skills.
I think this “tangling” of variables, which intuitively bugs me, is why all the factors correlate with each other (except for neuroticism, which inversely correlates). To give one example, someone who is simultaneously extroverted (as defined by me, above) and neurotic will have depressed extroversion scores because they’re nervous and unskilled in social situations. They’re not introverted, they’re just isolated, and the Big 5 can’t tell the difference. The human reader will go on to conclude that introverts are inherently more neurotic. (I can give several other examples in this format for the other constructs, if you don’t agree with this particular one.)
Here’s an example of this happening “in the wild”: someone talking about the dark side of creativity, referencing a study showing that creatives score lower on the Honesty-Humility factor of the HEXACO.
Now, how do you think a highly creative, original thinker would answer a HEXACO question like “People sometimes tell me that I am too critical of others”? Do they voice criticism because they are arrogant meanies who don’t care about hurting others, or is it because they’re nonconformists who value honesty over the warmth of mutual agreement and are simply treating others as they want to be treated themselves? Obviously the latter, right? But in the data, that’s going to come out as “Scoring low on Honesty-Humility”. (This sort of thing is why I suggested that Agreeableness ends up collapsing kindness and conformity into one score.)
Anyway, my previous dismissal of the Big 5 was wrong. It’s very good at what it claims to do, which is to separate out the factors that lead to certain patterns of responses. From the algorithm’s point of view, it doesn’t matter if X and Y are uncorrelated with each other: they still cluster together because of their shared relationship to P, and so they fold into Construct C. And it is useful to separate out broad patterns of responses into broad constructs even if those constructs contain totally uncorrelated factors, because there is no way you could separate out all the uncorrelated factors anyhow.
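A toy simulation makes the X/Y/P point concrete. This is only a sketch under made-up assumptions: the Gaussian traits, the noise level, the sample size, and the helper name `pearson` are all invented for illustration and don’t come from any real Big 5 data or scoring procedure.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # hypothetical number of respondents

# X and Y are drawn independently, so they are unrelated by construction.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# P is a behavior that both X and Y feed into, plus some noise (assumed sd 0.5).
p = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]

r_xy = pearson(x, y)  # should come out near zero
r_xp = pearson(x, p)  # around 0.67 in expectation under these assumptions
r_yp = pearson(y, p)  # same expectation as r_xp

print(f"corr(X, Y) = {r_xy:.2f}")
print(f"corr(X, P) = {r_xp:.2f}")
print(f"corr(Y, P) = {r_yp:.2f}")
```

Under these assumptions, X and Y barely correlate with each other, yet both correlate substantially with P. A method that groups items by shared variance can therefore put all three in one bucket, with P acting as the bridge, even though X and Y have nothing to do with each other.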
Still, that’s why it bugs me. I feel intuitively more satisfied when I look at tests which attempt to myopically measure uncorrelated factors individually and totally ignore the other factors, even though those tests probably won’t be as good as the Big 5 at catching broad swathes of behavior. People do actually feel like creative smart folks are arrogant and narcissistic meanies because of the behavioral overlap (which is why LessWrong-ish types are often accused of arrogance and write posts delineating proper humility from “false humility”), and the Big 5 does capture things like that. That instinctively bugs me because it’s conflating things, but on the other hand it’s conflating them in the same pattern that humans do, which is actually a pretty big achievement for an algorithm now that I’ve spent more time thinking about it.
Considering all of the above, I think it would stop bugging me if we had more precise names for the five factors. The trouble I have is that it’s just too easy to conflate here: the researchers are human to begin with and already prone to conflation, and when we label the Big 5 factors imprecisely we only facilitate that conflation. (And maybe the original researchers had this danger in mind when they labeled it Agreeableness instead of “Niceness”, hoping that people would realize that compliance is a factor as much as altruism, but...)
Does that make sense / still sound wrong to you? (If you get to it, sorry it’s not more concise.)