Kshitij Sachan comments on LLMs are (mostly) not helped by filler tokens

Kshitij Sachan 10 Aug 2023 6:20 UTC
2 points
0
Huh, interesting! Who else has run filler token experiments?
I was also interested in this experiment because it seemed like a crude way to measure how non-myopic LLMs are (i.e. what fraction of the forward pass is devoted to the current token vs. future tokens). I wonder if other people were mostly coming at it from that angle.
- Jacob Pfau 10 Aug 2023 20:28 UTC
  1 point
  1
  I’m currently working on filler token training experiments in small models. These GPT-4 results are cool! I’d be interested to chat.
  - Kshitij Sachan 10 Aug 2023 22:34 UTC
    1 point
    0
    Neat! I’ll reach out