Isn’t it just a lot more interesting to bet on gold than on the hardest problem?
The former seems like it averages out over lots of random bits of potential bad luck. And if the bot can do the hardest problem, wouldn’t it likely be able to do the other problems as well?
Problems 1 and 4 are often very easy and nearly mechanical for humans. And it’s not uncommon for 2 of the other problems to happen to be easy as well (2/5 are much easier than 3/6 and it’s considered OK for them to be kind of brute-forceable, and I expect that you could solve most IMO geometry problems without any kind of AI if you tried hard enough, even the ones that take insight for a human).
Sometimes you can get a gold by solving only 4/6 of the problems, and that’s not very well-correlated with whether the problems happen to be easy for a machine. And even when you need 5/6 it looks like sometimes you could get a very lucky break.
So if you get 4 swings, it looks fairly likely that one of them will be quite easy, and that dominates P(gold). E.g. I wouldn’t be too surprised by someone doing a bronze this year, having a swing next year and getting a silver or gold based on how lucky they get, and then if they get a silver trying again the next year just to be able to say they did it.
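To make the "4 swings" arithmetic concrete, here is a rough sketch. It assumes each year is an independent attempt with the same chance of an easy-enough problem set, which is a simplification, and the per-year probabilities are made-up illustrative values rather than estimates from this thread. The function name `p_at_least_one` is hypothetical.

```python
def p_at_least_one(p_per_year: float, swings: int) -> float:
    """Chance of at least one success in `swings` independent attempts,
    each succeeding with probability `p_per_year` (an assumed value)."""
    return 1.0 - (1.0 - p_per_year) ** swings


# Illustrative per-year chances of a "lucky" year; not real estimates.
for p in (0.05, 0.10, 0.20):
    print(f"per-year p={p:.2f} -> P(gold within 4 swings)={p_at_least_one(p, 4):.3f}")
```

Under these assumptions, even a 20% chance per year compounds to roughly 59% over four swings, which is the sense in which one easy year can dominate P(gold).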
What I was expressing a more confident prediction about was an idea like “The hardest IMO problems, and especially the ad hoc problems, seem very hard for machines.” Amongst humans, solving those problems is pretty well-correlated with getting golds, but unfortunately for machines it looks quite noisy, and so my probability had to go way up.
Unfortunately “hardest problem” is also noisy in its own way (in large part because a mechanical definition of “hardest” isn’t good) so this isn’t a great fix.
Hmm, in that case, would “all the problems” be better than either “hardest problem” or “gold”?