What do you mean by “give it enough tries”?
That is the sampling part of AlphaCode. When you ask it a question, the more samples you let it take as potential solutions, the more likely you are to get a correct answer. The success rate seems to grow log-linearly with the number of samples, which is pretty poor.
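To make that setup concrete, here is a minimal sketch of the sample-many-candidates idea being described. This is not AlphaCode's actual pipeline; `generate_candidate`, `passes_example_tests`, and `solve_by_sampling` are hypothetical placeholders.

```python
import random

def generate_candidate(problem: str) -> str:
    """Hypothetical stand-in for a model sampling one candidate program."""
    # In the real setting this would be a language-model sample; here it is a placeholder.
    return f"# candidate solution for: {problem} (seed {random.random():.6f})"

def passes_example_tests(program: str, tests: list[tuple[str, str]]) -> bool:
    """Hypothetical check: does the candidate pass the problem's example tests?"""
    # Placeholder: a real check would execute the program on each input/output pair.
    return False

def solve_by_sampling(problem: str, tests: list[tuple[str, str]], k: int) -> str | None:
    # Draw k independent samples; keep the first one that passes the example tests.
    for _ in range(k):
        candidate = generate_candidate(problem)
        if passes_example_tests(candidate, tests):
            return candidate
    return None  # no sample passed; the problem counts as unsolved at this budget
```

The question in the exchange below is how fast the chance that at least one of the k samples is correct grows as k increases.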
Oh, I see, I thought you meant AlphaCode had some sort of recurrence (does it?), i.e. iterating on its answer the way a human would when debugging code. Log-linear doesn’t sound bad; you wouldn’t expect a human to write a correct program on the first try without debugging either, would you?
It is pretty bad, because the constant matters here. If you have a method that achieves a success rate (in %) of C·log(1+k), where k is the number of samples, then with C = 1 you’d need 2^100 ≈ 10^30 samples per inference to get all the competitive programming questions right. As it stands, C seems pretty small: at k = 100,000 you solve maybe 33% of problems, and that isn’t much better than at k = 1,000.
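A quick check of the arithmetic in that message, assuming the made-up model is success% = C·log2(1+k) (the base is confirmed a few messages later):

```python
import math

def success_rate(C: float, k: int) -> float:
    """Modelled solve rate in percent: C * log2(1 + k)."""
    return C * math.log2(1 + k)

# With C = 1, reaching 100% requires log2(1 + k) = 100, i.e. k ≈ 2^100 ≈ 1.3e30 samples.
k_needed = 2 ** 100
print(f"{k_needed:.2e}")                   # ~1.27e+30

# The quoted numbers imply a small C: about 33% solved at k = 100,000.
implied_C = 33 / math.log2(1 + 100_000)
print(f"{implied_C:.2f}")                  # ~2.0

# Going from k = 1,000 to k = 100,000 only adds a few log2 steps,
# which is why the jump in solve rate is modest.
print(f"{success_rate(implied_C, 1_000):.1f}%")    # ~19.8%
print(f"{success_rate(implied_C, 100_000):.1f}%")  # ~33.0%
```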
That’s leaving aside the question of whether you will continue to get log-linear improvements, which I find fairly unlikely. Though if you could get C to something like 10 or 20 with one or two innovations, I’d be scared or even terrified.
To check my understanding:
If C were 20, then 100,000 samples would be enough to solve approximately 100% of these problems?
But instead C is something like 6-7, because 100,000 samples gets you to 33%?
Yeah, basically, but I was using log_2(x) instead of log_10(x) in my made-up example. Here’s some actual data. What they mean by “unlimited attempts” is that there is no filter on the sampled solutions, so they are all submitted to Codeforces. I expect false positives or incredibly slow solutions to become more likely, rather than performance actually increasing without limit.
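The choice of base accounts for the gap between the two estimates above. Here is the same implied-C calculation under both log2 and log10, assuming the 33%-at-100,000-samples figure quoted earlier:

```python
import math

k = 100_000
solved_pct = 33.0  # roughly the quoted solve rate at this sampling budget

# Implied constant under each base of the made-up model success% = C * log(1 + k).
C_log2 = solved_pct / math.log2(1 + k)    # ~2.0  (the base used in the original estimate)
C_log10 = solved_pct / math.log10(1 + k)  # ~6.6  (the base behind the "6-7" reading)
print(f"C (log2):  {C_log2:.1f}")
print(f"C (log10): {C_log10:.1f}")

# Under log10, C = 20 would indeed put 100,000 samples at about 100%:
print(f"{20 * math.log10(1 + k):.0f}%")   # ~100%
# Under log2, the equivalent threshold is C ≈ 6:
print(f"{100 / math.log2(1 + k):.1f}")    # ~6.0
```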