You might find something like this in market research. Certainly the sort of analysis that predicts which advertisements are relevant to a user on sites like Facebook would be similar: it tries to answer a question like “Which advertisement will the user be most receptive to given this cluster of traits?”, where the traits are your likes, dislikes, music, etc.
This isn’t exactly what you’re asking for, but I doubt there is a P(personality type | trait) table anywhere. You’re talking about a high-dimensional space and a single trait does not have much predictive power in isolation.
If I had enough data points on people’s personality traits, I could load them into something like Weka, look for empirical clusters (using k-means, hierarchical clustering, or similar), and then train a number of classifiers to sort individual people into those clusters given a limited number of personality trait observations.
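To make the clustering step concrete, here is a minimal k-means sketch in plain Python rather than Weka. The two-dimensional trait scores, the choice of two clusters, and the starting centroids are all made up for illustration; real trait data would have many more dimensions and would need a less hand-picked initialization.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data: each tuple is (extraversion, openness) on a 0 to 1 scale.
people = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.7), (0.1, 0.2), (0.2, 0.1), (0.15, 0.3)]

# Starting centroids are hand-picked here to keep the example deterministic.
centroids, clusters = k_means(people, centroids=[people[0], people[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The cluster index a person lands in can then serve as the label that the classifiers are trained to predict.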
There are all sorts of forms these classifiers could take. You could do the same sort of thing wedrifed is thinking of: assume that traits are independent and use the p(personality type | trait) values that have the most predictive power to classify a person. This would be a naive Bayes classifier, of the sort that’s fantastically effective at spam filtering.
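A rough sketch of what such a naive Bayes classifier could look like, in plain Python. Note that it is written in the usual form, estimating P(trait | personality type) from counts and combining those with Bayes' rule, rather than reading values out of a P(personality type | trait) table. The trait names, type labels, and training examples are invented for illustration.

```python
from collections import Counter, defaultdict
from math import log

def train_naive_bayes(examples):
    """examples: list of (set_of_traits, personality_type) pairs."""
    type_counts = Counter(label for _, label in examples)
    trait_counts = defaultdict(Counter)
    for traits, label in examples:
        trait_counts[label].update(traits)
    vocabulary = {t for traits, _ in examples for t in traits}
    return type_counts, trait_counts, vocabulary

def classify(traits, model):
    """Return the type with the highest log posterior, treating traits as independent."""
    type_counts, trait_counts, vocabulary = model
    total = sum(type_counts.values())
    scores = {}
    for label, n in type_counts.items():
        score = log(n / total)  # log prior P(type)
        for t in vocabulary:
            # P(trait present | type), with add-one smoothing to avoid zeros
            p = (trait_counts[label][t] + 1) / (n + 2)
            score += log(p if t in traits else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data
examples = [
    ({"likes_parties", "talkative"}, "extravert"),
    ({"likes_parties"}, "extravert"),
    ({"talkative", "reads_a_lot"}, "extravert"),
    ({"reads_a_lot", "quiet"}, "introvert"),
    ({"quiet"}, "introvert"),
    ({"reads_a_lot"}, "introvert"),
]
model = train_naive_bayes(examples)
print(classify({"likes_parties", "talkative"}, model))
print(classify({"quiet"}, model))
```

The independence assumption is what keeps this cheap: each trait only needs its own count per type, not a count for every combination of traits.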
If you wanted to make something simpler—perhaps something you could print out as a handy pocket guide to classifying people—you could use a decision tree. That’s like a precomputed strategy for playing 20 questions, where you only ask questions whose answers pay rent. It’s approximate, but it can work surprisingly well. A related method is to build several randomized decision trees and have them vote.
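As a sketch of the 20 questions idea, here is a small decision tree builder in plain Python. It greedily picks the yes/no question that leaves the least uncertainty (entropy) about the label, which is one common way to grow such trees and not the only one. The questions, labels, and rows are hypothetical, and the voting ensemble of randomized trees is left out to keep it short.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def build_tree(rows, questions):
    """rows: list of (answers_dict, label).
    Returns a label (leaf) or a tuple (question, yes_subtree, no_subtree)."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not questions:
        return Counter(labels).most_common(1)[0][0]

    def remaining_entropy(q):
        yes = [l for a, l in rows if a[q]]
        no = [l for a, l in rows if not a[q]]
        return sum(len(part) / len(rows) * entropy(part) for part in (yes, no) if part)

    best = min(questions, key=remaining_entropy)
    yes_rows = [r for r in rows if r[0][best]]
    no_rows = [r for r in rows if not r[0][best]]
    if not yes_rows or not no_rows:  # the best question does not split the data, so stop
        return Counter(labels).most_common(1)[0][0]
    rest = [q for q in questions if q != best]
    return (best, build_tree(yes_rows, rest), build_tree(no_rows, rest))

def classify(tree, answers):
    while isinstance(tree, tuple):
        question, yes_branch, no_branch = tree
        tree = yes_branch if answers[question] else no_branch
    return tree

# Hypothetical data: "owns_cat" carries no information about the label.
rows = [
    ({"likes_parties": True,  "plans_ahead": True,  "owns_cat": True},  "socialite"),
    ({"likes_parties": True,  "plans_ahead": False, "owns_cat": False}, "socialite"),
    ({"likes_parties": False, "plans_ahead": True,  "owns_cat": True},  "planner"),
    ({"likes_parties": False, "plans_ahead": True,  "owns_cat": False}, "planner"),
    ({"likes_parties": False, "plans_ahead": False, "owns_cat": True},  "drifter"),
    ({"likes_parties": False, "plans_ahead": False, "owns_cat": False}, "drifter"),
]
tree = build_tree(rows, ["likes_parties", "plans_ahead", "owns_cat"])
print(tree)
# ('likes_parties', 'socialite', ('plans_ahead', 'planner', 'drifter'))
```

On this toy data the uninformative question about cats never appears in the tree, which is the "only ask questions whose answers pay rent" property in miniature.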
Of course, once you build a classifier, that’s a hypothesis about some structure in reality. You need to test that hypothesis before you rush forth and start putting your trust in it. For that, you can hold some of the data in reserve, and see how a classifier built from the rest of the data performs on it. If you break your data up into n groups and take turns letting each group be the testing data set (n-fold cross-validation), this can tell you whether your general method for generating classifiers is working for this data set.
Of course this is all terribly ad-hoc, but the Bayesian ideal approach is hard to compute here, and often these hacks work surprisingly well.
This isn’t exactly what you’re asking for, but I doubt there is a P(personality type | trait) table anywhere. You’re talking about a high-dimensional space and a single trait does not have much predictive power in isolation.
And yet, this is exactly what personality tests must rely on and the sort of thing that we are doing when we follow the advice in the post. Access to even the raw data used when creating the ‘big five’ would be useful.
No, the article specifically warns against using a single trait. It gives specific examples of how a single trait can mean very different things. It takes a cluster of traits to establish something useful.
If you want to pursue getting the data, though, you could try to derive something like a table of probabilities from a self-scored ‘Big Five’ test, like the one in the appendix of this review paper. From that same review paper you can also find the papers and data sets that gave rise to five-factor personality analysis.
edit: fixed the link.