Thanks for this. In principle, you could use KBCs for any kind of evaluation, including evaluation of products, texts (essay grading, application letters, life plans, etc), pictures (which of my pictures is the best?), etc. The judicial system is very high-stakes and probably highly resistant to reform, whereas some of the contexts I list are much lower stakes. It might be better to try out KBCs in such a low-stakes context (I’m not sure which one would be best). I don’t know what extent KBCs have tested for these kinds of purposes (it was some time since I looked into these issues, and I’ve forgotten a bit). That would be good to look into.
One possible issue that one would have to overcome is explicit collusion among subsets of raters. Another is, as you say, that people might converge on some salient characteristics that are easily observable but don’t track what you’re interested in (this could at least in some cases be seen as a form of “tacit collusion”).
My impression is that collusion is a serious problems for ratings or recommender systems (which KBCs can be seen as a type of) in general. As a rule of thumb, people might be more inclined to engage in collusion when the stakes are higher.
To prevent that, one option would be to have a small number of known trustworthy experts, who also make evaluations which function as a sort of spot checks. Disagreement with those experts could be heavily penalised, especially if there are signs that the disagreement is due to (either tacit or explicit) collusion. But in the end only any anti-collusion measure needs to be tested empirically.
Relatedly, once people have a history of ratings, you may want to give disproportionate weights to those with a strong track record. Such epistocratic systems can be more efficient than democratic systems. See Thirteen Theorems in Search of the Truth.
KBCs can also be seen as a kind of prediction contests, where you’re trying to predict other people’s judgements. Hence there might be synergies with other forms of work on predictions.
Thanks for this. In principle, you could use KBCs for any kind of evaluation, including evaluation of products, texts (essay grading, application letters, life plans, etc), pictures (which of my pictures is the best?), etc. The judicial system is very high-stakes and probably highly resistant to reform, whereas some of the contexts I list are much lower stakes. It might be better to try out KBCs in such a low-stakes context (I’m not sure which one would be best). I don’t know what extent KBCs have tested for these kinds of purposes (it was some time since I looked into these issues, and I’ve forgotten a bit). That would be good to look into.
One possible issue that one would have to overcome is explicit collusion among subsets of raters. Another is, as you say, that people might converge on some salient characteristics that are easily observable but don’t track what you’re interested in (this could at least in some cases be seen as a form of “tacit collusion”).
My impression is that collusion is a serious problems for ratings or recommender systems (which KBCs can be seen as a type of) in general. As a rule of thumb, people might be more inclined to engage in collusion when the stakes are higher.
To prevent that, one option would be to have a small number of known trustworthy experts, who also make evaluations which function as a sort of spot checks. Disagreement with those experts could be heavily penalised, especially if there are signs that the disagreement is due to (either tacit or explicit) collusion. But in the end only any anti-collusion measure needs to be tested empirically.
Relatedly, once people have a history of ratings, you may want to give disproportionate weights to those with a strong track record. Such epistocratic systems can be more efficient than democratic systems. See Thirteen Theorems in Search of the Truth.
KBCs can also be seen as a kind of prediction contests, where you’re trying to predict other people’s judgements. Hence there might be synergies with other forms of work on predictions.