The whole problem with “Human raters make systematic errors” is that those same errors are likely to show up in the heavily scrutinized ground truth as well. If you have a way of creating a correct ground truth that avoids this problem, you don’t need the second model; you can just use that ground truth as the dataset for the first model.