The NN seems to make a bigger difference for actual IMO problems, going from 18⁄30 to 25⁄30, but maybe it’s overfit.
Interesting. Do you know where in the paper that was?
See Table 1. In particular, the comparison between “DD + AR + human-designed heuristics” and “AlphaGeometry”.
The NN seems to make a bigger difference for actual IMO problems, going from 18⁄30 to 25⁄30, but maybe it’s overfit.
Interesting. Do you know where in the paper that was?
See Table 1. In particular, the comparison between “DD + AR + human-designed heuristics” and “AlphaGeometry”.