But I wouldn’t index on his results too much because I’m unsure how replicable they are.
OK. If Oam really was using a single IC token and getting benefits, then I think that would point to the forward-pass/additional-model-parameters-activated explanation: obviously, his model is unaffected by RLHF, MoEs, model scaling, or any of the other alternate hypotheses. He should probably look into that if he can replicate padding benefits. It would not necessarily resolve the question for larger models, but at least it’d establish that such an effect exists in some model implementations.
Claude with RLAIF (which is basically the same)
I don’t think RLAIF is ‘basically the same’. I find Claude to be qualitatively highly different from RLHF’d models—as I’ve commented before, the RLHF seems to cause some rather perverse consequences that RLAIF avoids. For example, you cannot ask ChatGPT to “Write a non-rhyming poem.”—I’ve tried this prompt dozens of times, and it always writes a rhyming poem, even if you explicitly then tell it it incorrectly wrote a rhyming poem and to try again, while Claude nails it without trouble.
RLHF seems to distort the entire distribution of outputs to be on-policy and suck all outputs into the vortex of RLHF’d-pablum, while RLAIF seems to be far more selective and mostly affect directly-morality-relevant outputs, leaving neutral stuff like ‘write a poem’ undistorted.
My main pushback here is I’d be pretty surprised if gradient descent so inefficiently routed to the wrong experts on normal math problems that I would see a 10% improvement with a distribution shift.
/shrug. You are looking at rather different problems here, aren’t you? I mean, GSM8K problems are complex word problems, and not like the other problems you are benchmarking like ‘Addition’ or ‘GCD’.
ChatGPT-4 seems to have improved at diverse literary styles. It sometimes ignores the “non-rhyming” instructions, but I was able to get it to avoid rhyme on my second try by first asking it, “Can you write poems that don’t rhyme?”.
https://chat.openai.com/share/698343c1-764e-4a65-9eb8-f2ec4e40da1b
Minor point, but I asked the code interpreter to produce a non-rhyming poem, and it managed to do so on the second time of asking. I restricted it to three verses because it started off well on my initial attempt, but veered into rhyming territory in later verses.
I asked the code interpreter to produce a non-rhyming poem
FWIW, all of the examples by me or others were either the Playground or chat interface. I haven’t subscribed so I don’t have access to the code interpreter.
but veered into rhyming territory in later verses.
Yep, sucked into the memorized-rhymes vortex. I’m glad to hear it now works sometimes, well, at least partially, if you don’t give it too long to go back on-policy & ignore its prompt. (Maybe all of the times I flagged the rhyming completions actually helped out a bit.)
As a quick test of ‘forcing it out of distribution’ (per Herb’s comment), I tried prompting gpt-3.5-turbo in the Playground with “Write a non-rhyming poem.”, both with and without a prefix consisting of about 30 lines of “a a a a a a a a a” repeated.
Without the prefix, I only get 1⁄6 non-rhyming poems (ie. 5⁄6 clearly rhymed); with the prefix, I get 4⁄5 non-rhyming poems (1 did rhyme anyway).
Might be something interesting there?
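If anyone wants to rerun this comparison outside the Playground, a minimal sketch against the OpenAI chat-completions API (Python v1 client) would look something like the following; the sample counts and temperature are placeholders rather than the exact Playground settings.

```python
# Hedged sketch: compare "Write a non-rhyming poem." with and without a
# ~30-line prefix of "a a a a a a a a a", then judge rhyming by hand.
# Assumes the openai v1 Python client and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

PREFIX = "a a a a a a a a a\n" * 30          # out-of-distribution padding prefix
INSTRUCTION = "Write a non-rhyming poem."

def sample_poem(use_prefix: bool) -> str:
    """Return one sampled completion, optionally prepending the padding prefix."""
    prompt = (PREFIX + INSTRUCTION) if use_prefix else INSTRUCTION
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,   # placeholder; Playground sampling settings may differ
    )
    return resp.choices[0].message.content

# Draw a handful of samples per condition; whether each rhymes is judged by eye.
for use_prefix in (False, True):
    for i in range(5):
        print(f"--- prefix={use_prefix}, sample {i + 1} ---")
        print(sample_poem(use_prefix))
```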