Didn’t you just show that “machines are biased because they learn from history and history is biased” is indeed the case? The base rates differ because of historical circumstances.
I’m following common speech where “biased” means “statistically immoral, because it violates some fairness requirement”.
I showed that with a base-rate difference, it’s impossible to satisfy all three fairness requirements simultaneously. The decider (machine or not) can completely ignore history; it could be a coin-flipper. As long as the decider is imperfect, it will still violate at least one of the fairness requirements.
And if the base rates are not due to historical circumstances, this impossibility still stands.
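To make the coin-flipper claim concrete, here is a minimal sketch (mine, not from the exchange above; the group labels and the 0.3 / 0.5 base rates are made-up numbers). A decider that flags each person by a fair coin has identical false positive and false negative rates for both groups, yet its positive predictive value is just each group’s base rate, so it violates predictive parity the moment the base rates differ:

```python
# Illustration only: why even a coin-flip decider is "unfair" under one of the
# standard fairness criteria once base rates differ. Numbers are made up.

def coin_flip_metrics(base_rate, p_flag=0.5):
    """Metrics for a decider that flags each person with probability p_flag,
    independently of whether they would actually re-offend."""
    # False positive rate: P(flagged | would not re-offend) = p_flag
    fpr = p_flag
    # False negative rate: P(not flagged | would re-offend) = 1 - p_flag
    fnr = 1 - p_flag
    # Positive predictive value: P(would re-offend | flagged).
    # The flag is independent of the outcome, so this equals the base rate.
    ppv = base_rate
    return fpr, fnr, ppv

for group, base_rate in [("group A", 0.3), ("group B", 0.5)]:
    fpr, fnr, ppv = coin_flip_metrics(base_rate)
    print(f"{group}: FPR={fpr:.2f}  FNR={fnr:.2f}  PPV={ppv:.2f}")

# Output:
# group A: FPR=0.50  FNR=0.50  PPV=0.30
# group B: FPR=0.50  FNR=0.50  PPV=0.50
#
# Error rates are identical across groups, but among the people the coin
# flags, the fraction who would actually re-offend differs by group -- the
# coin-flipper violates predictive parity purely because the base rates differ.
```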
I’m not sure what “statistically immoral” means nor have I ever heard the term, which makes me doubt it’s common speech (googling it does not bring up any uses of the phrase).
I think we’re using the term “historical circumstances” differently; I simply mean what’s happened in the past. Isn’t the base rate purely a function of the records of white/black convictions? If so, then the fact that the rates are not the same is the reason that we run into this fairness problem. I agree that this problem can apply in other settings, but in the case where the base rate is a function of history, is it not accurate to say that the cause of the conundrum is historical circumstances? An alternative history with equal, or essentially equal, rates of convictions would not suffer from this problem, right?
I think what people mean when they say things like “machines are biased because they learn from history and history is biased” is precisely this scenario: historically, conviction rates are not equal between racial groups and so any algorithm that learns to predict convictions based on historical data will inevitably suffer from the same inequality (or suffer from some other issue by trying to fix this one, as your analysis has shown).
No. Any decider will be unfair in some way, whether or not it knows anything about history. The decider could be a coin-flipper and it would still be biased. One could say that the unfairness is baked into the reality of the base-rate difference.
The only way to fix this is not to fix the decider, but either to somehow make the base-rate difference disappear, or to relax the definition of fairness so that it is less stringent and actually satisfiable.
And in common language, and in common discussion of algorithmic bias, “bias” is decidedly NOT a merely statistical notion. It always carries a moral judgment: the violation of a fairness requirement. To say that a decider is biased is to say that the statistical pattern of its decisions violates a fairness requirement.
The key message is that, by this common-language definition, “bias” is unavoidable. No amount of trying to fix the decider will make it fair. Blinding it to history will do nothing. The unfairness is in the base rates, and in the definition of fairness.
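Running the same arithmetic in the other direction makes the trade-off explicit. This second sketch (again mine, with made-up numbers) fixes the positive predictive value and the true positive rate to be the same for both groups, and shows that the base-rate difference alone then forces the false positive rates apart:

```python
# A sketch of the algebra behind the impossibility claim, with made-up numbers.
# PPV = p*TPR / (p*TPR + (1-p)*FPR); solving for FPR shows that equal PPV and
# equal TPR across groups force unequal FPRs whenever the base rates p differ.

def required_fpr(base_rate, tpr, ppv):
    """False positive rate implied by a base rate, a TPR, and a target PPV."""
    p = base_rate
    return (p / (1 - p)) * tpr * (1 - ppv) / ppv

tpr, ppv = 0.7, 0.6  # the same accuracy targets imposed on both groups
for group, base_rate in [("group A", 0.3), ("group B", 0.5)]:
    print(f"{group}: FPR forced to {required_fpr(base_rate, tpr, ppv):.2f}")

# group A: FPR forced to 0.20
# group B: FPR forced to 0.47
```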
The base rates in the diagram are not historical but “potential” rates: they show the proportion of current inmates up for parole who would be re-arrested if paroled. In practice this is indeed estimated from historical rates, but as long as the true base rates really do differ, no algorithm can be fair in the two senses described above.
Afaik, in ML, the term bias is used to describe any move away from the uniform / mean case. But in common speech, such a move would only be called a bias if it’s inaccurate. So if the algorithm learns a true pattern in the data (X is more likely to be classified as 1 than Y is) that wouldn’t be called a bias. Unless I misunderstand your point.
1. No-one has access to the actual “re-offend” rates: all we have is “re-arrest,” “re-convict,” or at best “observed and reported re-offence” rates.
2. A priori, we do not expect the amount of melanin in a person’s skin, or the word they write down on a form next to the prompt “Race”, to be correlated with the risk of re-offence. So any tool that looks at “a bunch of factors” and comes up with “Black people are more likely to re-offend” is “biased” compared to our prior (even if our prior is wrong).
All evidence is “biased compared to our prior”. That is what evidence is.