Qumeric comments on Stephen McAleese’s Shortform

Qumeric 15 Sep 2024 16:17 UTC
8 points
I would like to note that this dataset is not as hard as it might look. Humans did not perform that well because there is a strict time limit; I don’t remember exactly, but it was something like 1 hour for 25 tasks (and IIRC the medalist only made arithmetic errors). I am pretty sure any IMO gold medalist would typically score 100% given (say) 3 hours.

Nevertheless, it’s very impressive, and AIMO results are even more impressive in my opinion.