You’ve linked to this Sakana AI paper like 8 times in the last week. IMO, please stop, it’s complete bunk, basically a scam.
I don’t think it being bunk is really any evidence against automated ML research becoming a thing soon, or even already being a thing, but the fact that you keep linking to it while ignoring the huge errors in it, and pretending it proves some point, is frustrating.
As someone who dislikes the hype over Sakana AI, and who agrees that Bogdan should stop linking to it so much, I think it’s less a scam and more an overhyped product that wasn’t ready for primetime or for the discussion it got.
The discourse around Sakana AI was not good, but I do think it has some uses, just not nearly as many as people want it to have.
Well-known complete bunk can be useful for gesturing at an idea, even as it gives no evidence about related facts. It can be a good explanatory tool when there is little risk that people will take away invalid inferences about related facts.
(Someone strong-downvoted Bogdan’s comment, which I opposed with a strong-upvote: the comment doesn’t by itself seem to commit the error of believing the Sakana hype, and it gates my reply, which I don’t want hidden just because its parent comment sinks deep into negative karma.)
To clarify further: I think of the Sakana paper roughly the way I think of AutoGPT. LM agents were initially overhyped, and AutoGPT specifically didn’t work anywhere near as well as some people expected, but I expect LM agents as a whole to be a huge deal.
I genuinely don’t know what you’re referring to.
Fwiw, I’m linking to it because I think it’s the first/clearest demo of how the entire ML research workflow (see e.g. Figure 1 in the arXiv paper) can plausibly be automated using LM agents, and they show a proof of concept which arguably already does something (in any case, it works better than I’d have expected it to). If you know of a better reference, I’d be happy to point to that instead or as well. Similarly, if you can ‘debunk’ it, I’d like to see that (I don’t think it’s been anywhere near debunked).
We had this conversation two weeks ago?
https://www.lesswrong.com/posts/rQDCQxuCRrrN4ujAe/jeremy-gillen-s-shortform?commentId=TXePXoEosJmAbMZSk
I thought you meant the AI Scientist paper has some obvious flaws or errors (e.g. methodological or in the code). I find that thread unconvincing, but we’ve been over this.
It doesn’t demonstrate automation of the entire workflow: you have to, for instance, tell it which topic to generate ideas about and seed it with examples. Also, the automated reviewer rejected the autogenerated papers. (Which, considering how sycophantic such automated reviewers tend to be, reflects very negatively on the papers’ quality, IMO.)