ryan_greenblatt comments on Getting 50% (SoTA) on ARC-AGI with GPT-4o

ryan_greenblatt 18 Jun 2024 1:36 UTC
26 points
6
I endorse this comment for the record.

I’m considering editing the blog post to clarify.

If I had known that prior work got a wildly different score on the public test set (comparable to the score I get), I wouldn’t have claimed SOTA.

(That said, as you note, it seems reasonably likely (though unclear) that this prior solution was overfit to the test set while my solution is not.)
- ryan_greenblatt 21 Jun 2024 20:51 UTC
  3 points
  0
  I’m submitting to the private leaderboard (with fewer samples than used in this post). If results indicate that SOTA is unlikely, I’ll retract my claim.
  - ryan_greenblatt 27 Jun 2024 18:59 UTC
    3 points
    0
    I edited to add:
    (Edit: But see this comment and this comment for important clarifications.)
    I also changed “this dataset” to “a similarly difficult dataset”.