Ethan Mendes comments on Reproducing ARC Evals’ recent report on language model agents