Based on the other thread I now want to revise this prediction, both because 4% was too low and because “IMO gold” has a lot of noise in it depending on test difficulty.
I’d put 4% on “For the 2022, 2023, 2024, or 2025 IMO, an AI built before the IMO is able to solve the single hardest problem” where “hardest problem” = “usually problem #6, but use problem #3 instead if either: (i) problem 6 is geometry or (ii) problem 3 is combinatorics and problem 6 is algebra.” (I would prefer to just pick the hardest problem after seeing the test, but it seems better to commit to a procedure.)
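To make the procedure unambiguous, here is a minimal sketch of the selection rule as I read it. The function name and the topic labels ("geometry", "combinatorics", "algebra") are my own illustrative assumptions; classifying each problem's topic is still a judgment call that the code does not settle.

```python
def hardest_problem(topic_p3: str, topic_p6: str) -> int:
    """Return the problem number (3 or 6) treated as the 'hardest problem'.

    Topic labels are hypothetical free-text tags, e.g. "geometry",
    "combinatorics", "algebra", "number theory".
    """
    p3 = topic_p3.strip().lower()
    p6 = topic_p6.strip().lower()

    # (i) Problem 6 is geometry: fall back to problem 3.
    if p6 == "geometry":
        return 3
    # (ii) Problem 3 is combinatorics and problem 6 is algebra: use problem 3.
    if p3 == "combinatorics" and p6 == "algebra":
        return 3
    # Default: problem 6.
    return 6
```

For example, `hardest_problem("combinatorics", "algebra")` picks problem 3, while `hardest_problem("algebra", "number theory")` stays with problem 6.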
Maybe I’ll go 8% on “gets gold” instead of “solves hardest problem.”
It would be good to get your updated view on this so that we can treat these as staked-out predictions.
(News: OpenAI has built a theorem-prover that solved many AMC12 and AIME competition problems, and 2 IMO problems, and they say they hope this leads to work that wins the IMO Grand Challenge.)