Trae “tchesket” Hesket comments on SolidGoldMagikarp (plus, prompt generation)

Trae “tchesket” Hesket 8 Mar 2023 4:30 UTC
0 points
I think I found some weird ones that I haven’t seen anyone else document yet (just by playing around with these tokens and turning on word probability):

‘ocobo’, ‘velength’, ‘iannopoulos’, ‘ oldemort’, ‘<|endoftext|>’, ‘ ii’
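
For anyone who wants to script this kind of check rather than eyeball it, here is a rough sketch of one way to read a probability display. It assumes you already have the top next-token candidates as a dict from token string to log-probability; the function name `repeat_probability` and the numbers in `example` are made up for illustration and are not real model output.

```python
import math


def repeat_probability(top_logprobs, target):
    """Sum the probability given to `target` among the listed candidates.

    `top_logprobs` is a hypothetical dict {token_string: log_probability}.
    Tokens are compared after stripping whitespace, so ' ocobo' and 'ocobo'
    both count. Returns 0.0 if the target is not among the candidates.
    """
    wanted = target.strip()
    total = 0.0
    for token, logprob in top_logprobs.items():
        if token.strip() == wanted:
            total += math.exp(logprob)  # log-probability -> probability
    return total


# Illustrative, made-up candidates for the position where the model was
# asked to repeat ' ocobo'.
example = {" oc": -0.4, " the": -2.1, " ocobo": -6.2}
print(repeat_probability(example, " ocobo"))
```

A very low value at the position where the model was asked to repeat the string would match the "avoids repeating it" behaviour, though where to draw the line is a judgment call.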
- mwatkins 8 Mar 2023 12:34 UTC
  2 points
  What we’re now finding is that there’s a “continuum of glitchiness”. Some tokens glitch worse than others, and I’ve devised an ad-hoc metric for measuring this (research report coming soon). There are a lot of “mildly glitchy” tokens that GPT-3 will try to avoid repeating, such as “velength” and “oldemort” (obviously parts of longer, familiar words that are rarely seen in isolation in text). There’s a long list of these in Part II of this post. I’d not seen “ocobo” or “oldemort” yet, but I’m systematically running tests on the whole vocabulary.
  - Trae “tchesket” Hesket 10 Mar 2023 20:35 UTC
    1 point