Afaik, in ML the term "bias" is used to describe any move away from the uniform/mean case. But in common speech, such a move would only be called a bias if it's inaccurate. So if the algorithm learns a true pattern in the data (X is more likely to be classified as 1 than Y is), that wouldn't be called a bias. Unless I misunderstand your point.
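For concreteness, here's a toy sketch of that last sentence (the groups, rates, and numbers are all invented): a model that learns a genuine difference in base rates between X and Y has "moved away from the uniform case" in the ML sense, even though it's only reflecting a true pattern in the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: group X really is positive 60% of the time,
# group Y only 30% of the time.
n = 10_000
group = rng.choice(["X", "Y"], size=n)
true_rate = np.where(group == "X", 0.6, 0.3)
label = rng.random(n) < true_rate

# The simplest possible "classifier": predict each group's empirical rate.
for g in ("X", "Y"):
    print(g, label[group == g].mean().round(3))
# Prints ~0.6 for X and ~0.3 for Y: a move away from the ~45% overall mean,
# but (by construction here) an accurate one, not a bias in the everyday sense.
```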
1. No one has access to the actual "re-offend" rates: all we have is "re-arrest," "re-convict," or at best "observed and reported re-offence" rates. (A toy simulation of why this matters appears after this list.)
2. A priori, we do not expect the amount of melanin in a person's skin, or the word they write down on a form next to the prompt "Race," to be correlated with the risk of re-offence. So any tool that looks at "a bunch of factors" and comes up with "Black people are more likely to re-offend" is "biased" compared to our prior (even if our prior is wrong).
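To make point 1 concrete, here is a toy simulation with entirely made-up numbers: even if the true re-offence rate were identical across two groups, a tool trained on re-arrest labels would report a group difference whenever arrest probabilities differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
group = rng.choice(["A", "B"], size=n)

true_reoffend = rng.random(n) < 0.30          # same 30% rate for everyone
p_arrest = np.where(group == "A", 0.7, 0.4)   # but arrest rates are not uniform
rearrested = true_reoffend & (rng.random(n) < p_arrest)

for g in ("A", "B"):
    mask = group == g
    print(g,
          "true re-offence rate:", true_reoffend[mask].mean().round(3),
          "observed re-arrest rate:", rearrested[mask].mean().round(3))
# True rates match (~0.30 and ~0.30); the observed labels do not (~0.21 vs ~0.12).
```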
All evidence is “biased compared to our prior”. That is what evidence is.
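A minimal Bayes-rule sketch of that point (the prior and likelihood ratio are placeholder numbers; only the arithmetic matters): any observation whose likelihood ratio isn't 1 pulls the posterior away from the prior, which is exactly what it means for the observation to be evidence at all.

```python
prior = 0.30                 # P(re-offend) before seeing some risk factor
likelihood_ratio = 2.0       # assumed likelihood ratio of the observed factor

prior_odds = prior / (1 - prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)

print(round(prior, 3), "->", round(posterior, 3))   # 0.3 -> 0.462
```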