Abe comments on BIG-Bench Canary Contamination in GPT-4

Abe Oct 24, 2024, 4:14 PM
53 points
It should be pointed out that the original paper/press release describing GPT-4 explicitly says that OpenAI found BIG-bench had contaminated their training data, and that they therefore excluded it as an evaluation. As far as I know, there was no similar disclosure for Claude or other models. See footnote 5 here: https://arxiv.org/abs/2303.08774v1