If a job application with a name that’s common among black people gets rejected while an identical one with a name that’s common among white people gets accepted, that would be an example of bad discrimination.
Does it matter if having said name is in fact correlated with job performance?
Only if it’s still correlated when you control for everything else on the CV and cover letter, including the fact that the candidate is not currently employed by anyone else.
Does it matter if having said name is in fact correlated with job performance?
Being correlated isn’t very valuable in itself. Even if you do believe that blacks on average have a lower IQ, scores on standardized tests tell you a lot more about someone’s IQ.
The question would be whether the name is a better predictor of job performance than grades for distinguishing people in the population of applicants, or whether the information that comes from the names adds additional predictive value.
But even if various proxies of social status do perform as predictors, I still value high social mobility. Policies that increase it might not be in the interest of the particular employer, but they can be in the interest of society as a whole.
The question would be whether the name is a better predictor of job performance than grades for distinguishing people in the population of applicants, or whether the information that comes from the names adds additional predictive value.
Emphasis mine. I don’t think this is the question at all, because you also have the grade information; the only question is if grades screen off evidence from names, which is your second option. It seems to me that the odds that the name provides no additional information are very low.
To the best of my knowledge, no studies have been done which submit applications where the obviously black names have higher qualifications, in an attempt to determine how many GPA points an obviously black name costs an applicant. (Such an experiment seems much more difficult to carry out, and doesn’t have the same media appeal.)
So, this “only question” formulation is a little awkward and I’m not really sure what it means. For my part I endorse correctly using (grades + name) as evidence, and I doubt that doing so is at all common when it comes to socially marked names… that is, I expect that most people evaluate each source of information in isolation, failing to consider to what extent they actually overlap (aka, screen one another off).
So, this “only question” formulation is a little awkward and I’m not really sure what it means.
ChristianKl brought up the proposition “(name)>(grades)”, where > means that the prediction accuracy is higher, but the truth or falsity of that proposition is irrelevant to whether or not it’s epistemically legitimate to include the name in a decision, which is determined by “(name+grades)>(grades)”.
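To make that criterion concrete, here is a minimal sketch on purely synthetic data (the sample size, noise levels, and the way “name” and “grades” relate to performance are all invented assumptions, not taken from any study): fit a classifier on grades alone and on grades plus a name indicator, and compare cross-validated accuracy.

```python
# Hypothetical illustration on synthetic data: is "(name+grades)>(grades)"?
# Sample size, noise levels, and effect structure are invented assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000
ability = rng.normal(size=n)                            # unobserved latent trait
grades = ability + rng.normal(scale=1.0, size=n)        # less noisy proxy
name = (ability + rng.normal(scale=2.0, size=n) > 0.0)  # noisier binary proxy
performs_well = (ability + rng.normal(scale=1.0, size=n) > 0.0).astype(int)

grades_only = grades.reshape(-1, 1)
grades_plus_name = np.column_stack([grades, name.astype(float)])

acc_g = cross_val_score(LogisticRegression(), grades_only, performs_well, cv=5).mean()
acc_gn = cross_val_score(LogisticRegression(), grades_plus_name, performs_well, cv=5).mean()
print(f"grades only:      {acc_g:.3f}")
print(f"grades plus name: {acc_gn:.3f}")  # higher here means the name added information
```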
I doubt that doing so is at all common when it comes to socially marked names
Doing things correctly is, in general, uncommon. But the shift implied by moving from ‘current’ to ‘correct’ is not always obvious. For example, both nonsmokers and smokers overestimate the health costs of smoking, which suggests that if their estimates became more accurate, we might see more smokers, not less. It’s possible that hiring departments are actually less biased against people with obviously black names than they should be.
if their estimates became more accurate, we might see more smokers, not less
...insofar as their current and future estimates of health costs are well calibrated with their actual smoking behavior, at least. Sure.
It’s possible that hiring departments are actually less biased against people with obviously black names than they should be.
Well, it’s odd to use “bias” to describe using observations as evidence in ways that reliably allow more accurate predictions, but leaving the language aside, yes, I agree that it’s possible that hiring departments are not weighting names as much as they should be for maximum accuracy in isolation… in other words, that names are more reliable evidence than they are given credit for being.
That said, if I’m right that there is a significant overlap between the actual information provided by grades and by names, then evaluating each source of information in isolation without considering the overlap is nevertheless a significant error.
Now, it might be that the evidential weight of names is so great that the error due to not granting it enough weight overshadows the error due to double-counting, and it may be that the signs are such that double-counting leads to more accurate results than not double-counting. Here again, I agree that this is possible.
But even if that’s true, continuing to erroneously double-count in the hopes that our errors keep cancelling each other out isn’t as reliable a long-term strategy as starting to correctly use all the evidence we have.
That said, if I’m right that there is a significant overlap between the actual information provided by grades and by names, then evaluating each source of information in isolation without considering the overlap is nevertheless a significant error.
Agreed. Any sort of decision process which uses multiple pieces of information should be calibrated on all of those pieces of information together whenever possible.
It’s even possible that if the costs of smoking are overestimated, more people should be smoking—part of the campaign against smoking is to understate the pleasures and social benefits of smoking.
For example, both nonsmokers and smokers overestimate the health costs of smoking, which suggests that if their estimates became more accurate, we might see more smokers, not less.
That in no way implies that it would be a good choice for people to smoke more. People don’t make those decisions through rational analysis.
Emphasis mine. I don’t think this is the question at all, because you also have the grade information; the only question is if grades screen off evidence from names, which is your second option. It seems to me that the odds that the name provides no additional information are very low.
If you combine a low-noise signal with a high-noise signal, the combined signal can be of medium noise. Combining information isn’t always useful if you want to use both signals as proxies for the same thing.
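As a toy illustration of the noise arithmetic (the 0.5 and 2.0 noise standard deviations are made-up numbers): a naive average of the two proxies does land at medium noise, while inverse-variance weighting combines them without that loss.

```python
# Toy illustration with made-up noise levels: combining a low-noise and a
# high-noise proxy for the same latent quantity.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
truth = rng.normal(size=n)
low_noise = truth + rng.normal(scale=0.5, size=n)   # good proxy
high_noise = truth + rng.normal(scale=2.0, size=n)  # bad proxy

naive = (low_noise + high_noise) / 2                # equal weights
w_lo, w_hi = 1 / 0.5**2, 1 / 2.0**2                 # inverse-variance weights
weighted = (w_lo * low_noise + w_hi * high_noise) / (w_lo + w_hi)

for label, est in [("low-noise alone", low_noise),
                   ("naive average", naive),
                   ("inverse-variance", weighted)]:
    print(f"{label:17s} RMSE = {np.sqrt(np.mean((est - truth) ** 2)):.3f}")
# The naive average lands between the two signals ("medium noise");
# proper weighting does slightly better than even the low-noise signal alone.
```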
To combine the name information with the grades in such a way, you would have to believe that the average black person with an IQ of 120 will get a higher GPA than the average white person of the same IQ.
I think there’s little reason to believe that’s true.
Without actually running a factor analysis on the outcomes of hiring decisions, it will be very difficult to know in which direction it would correct the decision.
Even if you do run a factor analysis, integrating additional variables costs you degrees of freedom, so it’s not always a good choice to integrate as many variables as possible into your model. Simple models often outperform more complicated ones.
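A minimal sketch of that last point, again on synthetic data (the sample size and the number of irrelevant variables are arbitrary): with few training examples, a regression that also includes thirty noise variables cross-validates worse than the one-variable model.

```python
# Toy illustration with arbitrary sizes: with a small sample, a model with
# many irrelevant inputs can generalize worse than a simple one-input model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 40                                      # deliberately small sample
signal = rng.normal(size=n)
y = signal + rng.normal(scale=1.0, size=n)
noise_vars = rng.normal(size=(n, 30))       # thirty irrelevant variables

simple = signal.reshape(-1, 1)
complicated = np.column_stack([simple, noise_vars])

r2_simple = cross_val_score(LinearRegression(), simple, y, cv=5).mean()
r2_complicated = cross_val_score(LinearRegression(), complicated, y, cv=5).mean()
print(f"simple model R^2:      {r2_simple:.3f}")
print(f"complicated model R^2: {r2_complicated:.3f}")  # typically far lower: overfitting
```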
Humans are also not good at combining multiple sources of information.
If you combine a low-noise signal with a high-noise signal, the combined signal can be of medium noise. Combining information isn’t always useful if you want to use both signals as proxies for the same thing.
Agreed that if you have P(A|B) and P(A|C), then you don’t have enough to get P(A|BC).
But if you have the right objects and they’re well-calibrated, then adding in a new measurement always improves your estimate. (You might not be sure that they’re well-calibrated, in which case it might make sense to not include them, and that can obviously include trying to estimate P(A|BC) from P(A|C) and P(A|B).)
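A concrete counterexample for the first point (all probabilities invented for illustration): two joint distributions can agree on P(A|B) and P(A|C) while disagreeing on P(A|BC), depending on how much B and C overlap.

```python
# Invented-numbers counterexample: P(A|B) and P(A|C) do not determine P(A|BC).
from itertools import product

def conditional(joint, a=1, **given):
    """P(A=a | given), computed from a joint table keyed by (A, B, C)."""
    match = lambda k: all(k["ABC".index(var)] == val for var, val in given.items())
    num = sum(p for k, p in joint.items() if k[0] == a and match(k))
    den = sum(p for k, p in joint.items() if match(k))
    return num / den

p_a, p_given_a = 0.5, {1: 0.8, 0: 0.2}  # P(A=1) and P(B=1|A) = P(C=1|A)

# Joint 1: B and C conditionally independent given A.
joint1 = {(a, b, c): (p_a if a else 1 - p_a)
          * (p_given_a[a] if b else 1 - p_given_a[a])
          * (p_given_a[a] if c else 1 - p_given_a[a])
          for a, b, c in product([0, 1], repeat=3)}

# Joint 2: C is an exact copy of B (the two signals fully overlap).
joint2 = {(a, b, c): (p_a if a else 1 - p_a)
          * (p_given_a[a] if b else 1 - p_given_a[a])
          * (1.0 if c == b else 0.0)
          for a, b, c in product([0, 1], repeat=3)}

for label, joint in [("independent signals", joint1), ("overlapping signals", joint2)]:
    print(label,
          "P(A|B=1) = %.3f," % conditional(joint, B=1),
          "P(A|C=1) = %.3f," % conditional(joint, C=1),
          "P(A|B=1,C=1) = %.3f" % conditional(joint, B=1, C=1))
# Both joints give P(A|B=1) = P(A|C=1) = 0.800, but
# P(A|B=1,C=1) is ~0.941 in the first and 0.800 in the second.
```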
To combine the name information with the grades in such a way, you would have to believe that the average black person with an IQ of 120 will get a higher GPA than the average white person of the same IQ.
Not quite. Regression to the mean implies that you should apply shrinkage which is as specific as possible, but this shrinkage should obviously be applied to all applicants. (Regressing black scores to the mean, and not regressing white scores, for example, is obviously epistemic malfeasance, but regressing black scores to the black mean and white scores to the white mean makes sense, even if the IQ-grades relationship is the same for blacks and whites.)
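A minimal sketch of that shrinkage rule (the reliability and the group means are made-up numbers): the same regression-to-the-mean formula is applied to every applicant, but each score is pulled toward its own group’s mean.

```python
# Illustrative group-specific shrinkage; the reliability and group means
# are made-up numbers, and the rule itself is identical for every group.
def shrink(score, group_mean, reliability):
    """Pull an observed score toward a mean, in proportion to its unreliability."""
    return reliability * score + (1 - reliability) * group_mean

reliability = 0.7                                  # assumed reliability of the score
group_means = {"group_a": 100.0, "group_b": 95.0}  # hypothetical group means

for group, mean in group_means.items():
    print(group, shrink(120.0, mean, reliability))  # same observed score, 120
# group_a -> 114.0, group_b -> 112.5: one rule, different means.
```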
It could also be that the GPA-job performance link is different for whites and blacks, even if the IQ-GPA link is the same for whites and blacks. (And, of course, race could impact job performance directly, but it seems likely the effects should be indirect for almost all jobs.)
I think there’s little reason to believe that’s true.
If you’re just comparing GPAs, rather than GPAs weighted by course difficulty, there could be a systematic difference in the difficulty of classes that applicants take by race. I’ve had a hard time getting numerical data on this, for obvious reasons, but there are rumors that some institutions may have a grade bias in favor of blacks. (Obviously, you can’t fit a parameter to a rumor, but this is reason to not discount an effect that you do see in your data.)
Simple models often outperform more complicated ones.
Yes, but… motivated cognition alert. If you’re building models correctly, you take this into account by default, and so there’s no point in bringing it up for any particular input because you should already be checking it for every input.
To combine the name information with the grades in such a way, you would have to believe that the average black person with an IQ of 120 will get a higher GPA than the average white person.
I think there’s little reason to believe that’s true.
Could you explain your reasoning here?
IQ is a strong predictor of academic performance, and a 1.5 sd gap is a fairly significant difference. The only thing I could think of to counterbalance it so that the average white would get a higher GPA would be through fairly severe racial biases in grading policies in their favor, which seems at odds with the legally-enforced racial biases in admissions / graduation operating in the opposite direction. Not to mention that black African immigrants, legal ones anyway, seem to be the prototype of high-IQ blacks who outperform average whites.
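As a rough back-of-the-envelope check (the IQ-GPA correlation here is an assumed round number, not a measured one): under a simple bivariate-normal model, the expected GPA gap in sd units is the correlation times the IQ gap.

```python
# Back-of-the-envelope with assumed numbers: the GPA gap implied by an IQ
# gap under a bivariate-normal model is E[GPA gap] = r * (IQ gap), in sd units.
iq_gap_sd = 1.5   # the gap cited above, in standard deviations
r_iq_gpa = 0.5    # assumed IQ-GPA correlation, for illustration only

print(f"expected GPA gap: {r_iq_gpa * iq_gap_sd:.2f} sd")  # 0.75 sd under these assumptions
```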
I am a little puzzled by the claim, which leads me to believe I’ve misunderstood you somehow or overlooked something fairly important.
I missed the qualification of speaking of whites with the same IQ. I added it via an edit.
Right, okay. I did misunderstand you. I’ll correct my comment as soon as I figure out the strikethrough function here.
I believe the primary way to get strikethrough is to strikethrough the entire comment, by retracting it.
You can use Unicode.
I’d recommend Vaniver’s solution instead—IME Android phones don’t like yours.