How so? It is a supervised learning problem: you have DNA markers as input features and self-reported race as the target class. If the model reaches >99% accuracy (*) I would say it performs pretty well.
(* The classes are skewed, but not extremely so. I don’t know whether this accuracy has been corrected for class skew, but even if it hasn’t, you wouldn’t get this accuracy unless the model worked as intended.)
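To make the class-skew caveat concrete, here is a minimal sketch comparing plain accuracy with balanced accuracy (the mean of per-class recall). The labels and the 90/10 split are made up for illustration and are not taken from the linked study; the helper names are my own.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Hypothetical, skewed data: 90 samples of class "A", 10 of class "B".
y_true = ["A"] * 90 + ["B"] * 10

# A degenerate "model" that always guesses the majority class.
majority = ["A"] * 100

print(accuracy(y_true, majority))           # 0.9
print(balanced_accuracy(y_true, majority))  # 0.5
```

Always guessing the majority class tops out at the majority's share of the data (0.9 here), so with classes that are only moderately skewed, an accuracy above 99% cannot come from skew alone.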
A South American native with Black skin color can have more DNA in common with a Japanese person than two native Africans from different parts of Africa have with each other.
Would this South American “native” self-identify as “black”?
How so? It is a supervised learning problem: you have DNA markers as input features and self-reported race as the target class. If the model reaches >99% accuracy (*) I would say it performs pretty well.
The point I wanted to make is that, in the real world, models in this area don’t have >99% accuracy.
Would this South American “native” self-identify as “black”?
That depends on the social environment. If they want to apply to a university that has a quota for Black students and their skin color is Black, there is a good chance that they will put Black in the field that asks for race.
How so? It is a supervised learning problem: you have DNA markers as input features and self-reported race as the target class. If the model reaches >99% accuracy (*) I would say it performs pretty well.
(* The classes are skewed, but not extremely so. I don’t know whether this accuracy has been corrected for class skew, but even if it hasn’t, you wouldn’t get this accuracy unless the model worked as intended.)
Would this South American “native” self-identify as “black”?
The point I wanted to make is that, in the real world, models in this area don’t have >99% accuracy.
That depends on the social environment. If they want to apply to a university that has a quota for Black students and their skin color is Black, there is a good chance that they will put Black in the field that asks for race.
The link many comments up suggests that we do in fact have >99% accuracy (when limited to major ethnic groups in the US).