Reading this makes me really wish I had access to some data: p(personality type | trait). That would make reading people as easy as counting cards in blackjack! Surely there is this kind of data out there… and if not, why not?!
Changing trends? Take hippy clothes. In the 60s they were a probable indicator of promiscuity; nowadays they are more likely to indicate a green (and greens are not known for their free love).
Also, if it were collated and published, humans would be somewhat anti-inductive where possible. For example, if liking cats were negatively correlated with sociopathy, and this was widely known, then sociopaths would pretend to like cats in order to avoid detection.
Do you really think studies that put numbers on trends that are already widely known would make a difference? Most people pick up this kind of data intuitively but would never consider memorizing a table buried in the middle of a research paper. Having the figures just makes it easier to learn empathy in a systematic way. (Which, incidentally, few people would even be capable of.)
When conducting surveillance across a diverse population, having this information would certainly be useful. What proportion of shoplifters carry large bags? What proportion of bag carriers are shoplifters?
Come to think of it, perhaps this is how airline safety ought to work?
Any field that tries to predict individual behavior ought to be able to make use of this data. It would be especially useful if you could tie in some computer-assisted image tagging.
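The two shoplifting questions above are linked by Bayes' rule, and when shoplifters are rare the two answers can be very far apart. Here is a rough sketch in Python. All three rates are numbers I made up for illustration, not data, and the function name is just a placeholder.

```python
def posterior(p_trait_given_type, p_type, p_trait_given_other):
    """P(type | trait) via Bayes' rule, for a yes/no type."""
    p_trait = p_trait_given_type * p_type + p_trait_given_other * (1 - p_type)
    return p_trait_given_type * p_type / p_trait

# Hypothetical rates, invented purely for illustration.
p_shoplifter = 0.01            # base rate of shoplifters among shoppers
p_bag_given_shoplifter = 0.80  # proportion of shoplifters carrying a large bag
p_bag_given_honest = 0.20      # proportion of honest shoppers carrying one

p_shoplifter_given_bag = posterior(
    p_bag_given_shoplifter, p_shoplifter, p_bag_given_honest
)
print(f"P(shoplifter | large bag) = {p_shoplifter_given_bag:.3f}")
```

With these invented rates, 80% of shoplifters carry a large bag, yet only about 4% of bag carriers are shoplifters, because honest shoppers vastly outnumber shoplifters.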
I suspect not everyone knows every trend. Lots of high class people might not know about straight edge punks. I also suspect that someone will write a pop-sci book about it if it is interesting.
You might find something like this in market research. Certainly the sort of analysis that predicts which advertisements are relevant to a user on sites like Facebook would be similar to this. It tries to answer a question like “Which advertisement will the user be most receptive to given this cluster of traits?”, where the traits are your likes, dislikes, music, etc.
This isn’t exactly what you’re asking for, but I doubt there is a P(personality type | trait) table anywhere. You’re talking about a high-dimensional space and a single trait does not have much predictive power in isolation.
If I had enough data points of people’s personality traits, I could stick them in something like Weka, look for empirical clusters (using something like k-means or hierarchical clustering), then train a number of classifiers to sort individual people into these clusters given a limited number of personality trait observations.
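To give a feel for the clustering step, here is a minimal k-means sketch in plain Python rather than Weka. The people, the two trait scores, and the function names are all hypothetical. The farthest-point initialization is my own simplification to keep the example deterministic; real tools usually use random restarts.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iterations=20):
    # Deterministic start: first point, then repeatedly the point farthest
    # from the centers chosen so far (an assumption made for this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    labels = [nearest(p, centers) for p in points]
    return centers, labels

# Hypothetical people scored 0-10 on two made-up traits.
people = [(1, 2), (2, 1), (1, 1), (2, 2), (8, 9), (9, 8), (9, 9), (8, 8)]
centers, labels = kmeans(people, k=2)
print(centers, labels)
```

On real trait data the clusters would be much less clean than this toy set, and you would have to choose k somehow, but the loop is the same.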
There are all sorts of forms these classifiers could take. You could do the same sort of thing wedrifed is thinking of: assume that traits are independent and use the p(personality type | trait) values that have the most predictive power to classify a person. This would be a naive Bayes classifier, of the sort that’s fantastically effective at spam filtering.
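A naive Bayes classifier of this kind fits in a few lines. One detail worth noting: it is usually built from P(trait | type) plus a prior over types, and gets to P(type | traits) by applying Bayes' rule. The sketch below assumes yes/no traits and uses add-one smoothing; the traits, the types, and the six observations are invented for illustration.

```python
import math
from collections import Counter, defaultdict

TRAITS = ["likes_parties", "keeps_lists", "talks_to_strangers"]  # hypothetical

def train_naive_bayes(observations):
    """observations: list of (set_of_traits_present, personality_type)."""
    type_counts = Counter(ptype for _, ptype in observations)
    trait_counts = defaultdict(Counter)
    for present, ptype in observations:
        for trait in present:
            trait_counts[ptype][trait] += 1
    return type_counts, trait_counts

def classify_naive_bayes(model, present):
    type_counts, trait_counts = model
    total = sum(type_counts.values())
    scores = {}
    for ptype, n in type_counts.items():
        log_p = math.log(n / total)  # prior P(type)
        for trait in TRAITS:
            # Add-one smoothed estimate of P(trait present | type).
            p = (trait_counts[ptype][trait] + 1) / (n + 2)
            # Independence assumption: just add up the log-likelihoods.
            log_p += math.log(p if trait in present else 1 - p)
        scores[ptype] = log_p
    return max(scores, key=scores.get)

# Made-up training data.
observations = [
    ({"likes_parties", "talks_to_strangers"}, "extravert"),
    ({"likes_parties"}, "extravert"),
    ({"talks_to_strangers", "keeps_lists"}, "extravert"),
    ({"keeps_lists"}, "introvert"),
    (set(), "introvert"),
    ({"keeps_lists", "likes_parties"}, "introvert"),
]
model = train_naive_bayes(observations)
print(classify_naive_bayes(model, {"likes_parties", "talks_to_strangers"}))
print(classify_naive_bayes(model, {"keeps_lists"}))
```

Working in log space avoids underflow once the number of traits gets large, which is the same trick spam filters use.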
If you wanted to make something simpler—perhaps something you could print out as a handy pocket guide to classifying people—you could use a decision tree. That’s like a precomputed strategy for playing 20 questions, where you only ask questions whose answers pay rent. It’s approximate, but it can work surprisingly well. A related method is to build several randomized decision trees and have them vote.
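As a sketch of the pocket-guide idea, here is a tiny decision-tree builder that picks, at each step, the yes/no question that leaves the least uncertainty about the label (an ID3-style information-gain rule, which is my choice for the example). The questions and the six people are made up, and the function names are placeholders.

```python
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    return -sum(c / len(labels) * math.log2(c / len(labels)) for c in counts.values())

def build_tree(rows, questions):
    """rows: list of (answers_dict, label). Returns a label or (question, yes, no)."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not questions:
        return majority

    def remaining_entropy(q):
        yes = [label for answers, label in rows if answers[q]]
        no = [label for answers, label in rows if not answers[q]]
        return sum(len(part) / len(rows) * entropy(part) for part in (yes, no) if part)

    best = min(questions, key=remaining_entropy)
    yes_rows = [r for r in rows if r[0][best]]
    no_rows = [r for r in rows if not r[0][best]]
    if not yes_rows or not no_rows:  # the question does not split the data
        return majority
    rest = [q for q in questions if q != best]
    return (best, build_tree(yes_rows, rest), build_tree(no_rows, rest))

def ask(tree, answers):
    """Walk the tree, asking only the questions it contains."""
    while isinstance(tree, tuple):
        question, yes_tree, no_tree = tree
        tree = yes_tree if answers[question] else no_tree
    return tree

QUESTIONS = ["likes_parties", "keeps_lists", "talks_to_strangers"]  # hypothetical
rows = [
    ({"likes_parties": True, "keeps_lists": False, "talks_to_strangers": True}, "extravert"),
    ({"likes_parties": True, "keeps_lists": True, "talks_to_strangers": True}, "extravert"),
    ({"likes_parties": False, "keeps_lists": True, "talks_to_strangers": True}, "extravert"),
    ({"likes_parties": True, "keeps_lists": True, "talks_to_strangers": False}, "introvert"),
    ({"likes_parties": False, "keeps_lists": False, "talks_to_strangers": False}, "introvert"),
    ({"likes_parties": False, "keeps_lists": True, "talks_to_strangers": False}, "introvert"),
]
pocket_guide = build_tree(rows, QUESTIONS)
print(pocket_guide)
```

In this toy data one question separates the two labels completely, so the tree never bothers asking the other two. That is the sense in which it only asks questions whose answers pay rent.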
Of course, once you build a classifier, that’s a hypothesis about some structure in reality. You need to test that hypothesis before you rush forth and start putting your trust in it. For that, you can hold some of the data in reserve, and see how a classifier built from the rest of the data performs on it. If you break your data up into n groups and take turns letting each group be the testing data set, this can tell you if your general method for generating classifiers is working for this data set.
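The take-turns scheme described above is usually called n-fold cross-validation. Here is a minimal sketch. The nearest-mean classifier is just a stand-in so the example runs on its own, and the scores and labels are invented. In practice you would shuffle the rows before splitting.

```python
from collections import defaultdict

def train_nearest_mean(training):
    """Stand-in classifier: remember the mean score of each label."""
    scores = defaultdict(list)
    for score, label in training:
        scores[label].append(score)
    return {label: sum(v) / len(v) for label, v in scores.items()}

def predict_nearest_mean(model, score):
    """Predict the label whose mean score is closest."""
    return min(model, key=lambda label: abs(model[label] - score))

def cross_validate(rows, n_folds, train, predict):
    """Fraction of rows predicted correctly while held out of training."""
    correct = 0
    for fold in range(n_folds):
        # Every n_folds-th row, starting at `fold`, is held in reserve.
        test = rows[fold::n_folds]
        training = [r for i, r in enumerate(rows) if i % n_folds != fold]
        model = train(training)
        correct += sum(predict(model, x) == y for x, y in test)
    return correct / len(rows)

# Hypothetical (trait score, label) pairs.
rows = [
    (1, "introvert"), (2, "introvert"), (3, "introvert"), (2, "introvert"),
    (8, "extravert"), (9, "extravert"), (7, "extravert"), (8, "extravert"),
]
accuracy = cross_validate(rows, 4, train_nearest_mean, predict_nearest_mean)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because each row is held out exactly once, the result estimates how the whole recipe for building classifiers does on data it has not seen, rather than how one particular classifier does on the data it was fit to.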
Of course this is all terribly ad hoc, but the Bayesian ideal approach is hard to compute here, and often these hacks work surprisingly well.
This isn’t exactly what you’re asking for, but I doubt there is a P(personality type | trait) table anywhere. You’re talking about a high-dimensional space and a single trait does not have much predictive power in isolation.
And yet, this is exactly what personality tests must rely on, and it is the sort of thing that we are doing when we follow the advice in the post. Access to even the raw data used when creating the ‘Big Five’ would be useful.
No, the article specifically warns against using a single trait. It gives specific examples of how a single trait can mean very different things. It takes a cluster of traits to establish something useful.
If you want to pursue getting the data, though, you could try to derive something like a table of probabilities from a self-scored ‘Big Five’ test, like the one in the appendix of this review paper. From that same review paper you can also find the papers and data sets that gave rise to five-factor personality analysis.
edit: fixed the link.