they ranked within the top 54% of the contestants.
If we limit the ratings to users who have participated in at least 1 contest in the last 6 months, AlphaCode is ranked (predict it first!) in the top 28%.
I didn’t get how this comparison is supposed to be meaningful. The main constraint for human programmers in code contests is the time limit. I mean, the last challenge on Codeforces gives you 2 hours and 30 minutes to solve 6 problems, which is not exactly a ton of time. You need to be a very good programmer to solve all the problems completely within the deadline, but any decent programmer should be able to solve the same problems within a week. I would claim that having more time to write dramatically improves your chance of producing correct code, in a way that having more time to make a chess move does not.
One could argue that they try to model this with the limit of 10 submissions per problem, but following the link I read this:
Removing the limit of 10 submissions can increase the solve rate further, reaching 49.7th percentile with an average of 29 submissions per solved problem.
It seems that the analogue of “give it plenty of time to write” makes the rank shift from the top 54% to the top 49.7%. Which is… not incredibly impressive? Did I forget to read some details that would make the comparison more meaningful?
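To make my intuition concrete, here is a toy model (my own construction, not something from the paper): suppose each submission is an independent attempt that gets accepted with some probability p. Then extra attempts should buy a lot of additional solve rate, which is why a shift of only a few percentile points surprises me.

```python
# Toy model (my assumption, not from the AlphaCode paper): each
# submission is an independent attempt accepted with probability p,
# so the chance of solving within k attempts is 1 - (1 - p)^k.

def solve_prob(p: float, k: int) -> float:
    """Probability of at least one accepted submission in k attempts."""
    return 1 - (1 - p) ** k

for p in (0.05, 0.10, 0.20):
    # Compare the capped setting (10 submissions) with the paper's
    # reported average of 29 submissions per solved problem.
    print(f"p={p:.2f}: 10 tries -> {solve_prob(p, 10):.2%}, "
          f"29 tries -> {solve_prob(p, 29):.2%}")
```

Under this (admittedly crude) independence assumption, going from 10 to 29 attempts nearly doubles the solve chance when p is small, so the modest rank improvement suggests the extra submissions are mostly retrying correlated failures.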
The limit of 10 submissions per problem is meaningful, I think. I can see myself making 10 submissions to check my code against test cases. But I agree that peak performance can require a lot more submissions, making it incomparable to humans in this context. Then again, if you eliminate the time limit, I can see some people never getting the solution for some of these problems. So I don’t think the two cases are that different. But I do expect the AI to perform worse in the no-time-limit case.