Bogdan Ionut Cirstea comments on Bogdan Ionut Cirstea’s Shortform

Bogdan Ionut Cirstea 1 Aug 2024 10:34 UTC
5 points
2
56% on SWE-bench Lite with repeated sampling (13 percentage points above the previous SOTA; up from 15.9% with one sample to 56% with 250 samples), using a far-below-SOTA model: https://arxiv.org/abs/2407.21787. Anything automatically verifiable (large chunks of math and coding) seems like it’s gonna be automatable in < 5 years.
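For intuition on how "fraction solved with k samples" is usually computed when a verifier can check each attempt, here is a minimal sketch of the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples were drawn and c of them passed. The function names `pass_at_k` and `coverage` and the toy counts are hypothetical, chosen for illustration; this is not the linked paper's code or data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k samples is correct) for one problem,
    given that c of n drawn samples passed the verifier."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k samples include a success.
        return 1.0
    # 1 minus the probability that all k samples come from the n - c failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def coverage(correct_counts: list[int], n: int, k: int) -> float:
    """Average pass@k over problems (fraction of problems solved by any of k samples)."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical toy data: 3 problems, 4 samples each, with 0, 1 and 4 passing samples.
toy_counts = [0, 1, 4]
print(coverage(toy_counts, n=4, k=1))
print(coverage(toy_counts, n=4, k=2))
```

Coverage can only go up with k, which is why a weak model plus a reliable automatic verifier can overtake a stronger model's single attempt.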
- Bogdan Ionut Cirstea 1 Aug 2024 12:33 UTC
  5 points
  0
  The finding on the differential importance of verifiability also seems in line with the findings from Trading Off Compute in Training and Inference.