A nice test might be the 2024 IMO (from July). I’m curious to see if it’s reached gold medal performance on that.
The IMO Grand Challenge might be harder; I don’t know how Lean works, but a formal Lean proof is probably harder to write than a human-readable one in LaTeX.
OpenAI would presumably have mentioned it if it had reached gold on the IMO.