Aaron_Scher comments on LLM Evaluators Recognize and Favor Their Own Generations

Aaron_Scher 18 Apr 2024 17:11 UTC
1 point
0
For example, if the GPT-4 evaluator gave a weighted score of $2.0$ to a summary generated by Claude 2 and a weighted score of $3.0$ to its own summary for the same article, then its final normalized self-preference score for the Claude summary would be $2 / (2 + 3) = 0.4$ .
Should this be 3/(2+3) = 0.6? Not sure I’ve understood correctly.