Garrett Baker comments on D0TheMath’s Shortform

Garrett Baker 23 Mar 2024 14:50 UTC
2 points
0

In LLM land, though not as drastic, we see similar things happening, in particular techniques for merging models to get rapid capability advances, and rapid creation of new patterns for agent interactions and tool use.

The biggest effect open sourcing LLMs seems to have is improving safety techniques. Why think this differentially accelerates capabilities over safety?
- Matt Goldenberg 23 Mar 2024 15:02 UTC
  7 points
  1
  it doesn’t seem like that’s the case to me—but even if it were the case, isn’t that moving the goal posts of the original post?
  I don’t think time-to-AGI got shortened at all.
  - Garrett Baker 24 Mar 2024 5:48 UTC
    2 points
    0
    You are right, but I guess the thing I do actually care about here is the magnitude of the advancement (which is relevant for determining the sign of the action). How large an effect do you think the model merging stuff has? (I’m thinking of the effect where, if you train a bunch of models and then average their weights, they do better.) It seems very likely to me it’s essentially zero, but I do admit there’s a small negative tail that’s greater than the positive, so the average is likely negative.
    
    As for agent interactions, all the (useful) advances there seem like things that definitely would have been made even if nobody released any LLMs and everything was behind APIs.
    - Matt Goldenberg 24 Mar 2024 15:54 UTC
      2 points
      0
      it’s true, but I don’t think there’s anything fundamental preventing the same sort of proliferation and advances in open source LLMs that we’ve seen in stable diffusion (aside from the fact that LLMs aren’t as useful for porn). that it has been relatively tame so far doesn’t change the basic pattern of how open source affects the growth of technology