More examples beyond CycleGAN:
‘non-robust features’ in image classification: they exist, and predict out of sample, but it’s difficult to say what they are
stylometrics: in natural language analysis, author identification can be done surprisingly well just by looking at usage of function words like ‘the’ or ‘an’. We find it difficult or impossible to notice subtle shifts in the frequencies of hundreds of common words, but statistical models can integrate them and identify authors in cases where humans fail (a minimal classifier sketch follows this list).
degenerate completions/the repetition trap: aaaaaaaaaaaaaaaaa -! (a greedy-decoding sketch follows below)
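A minimal sketch of what such a stylometric classifier looks like, assuming scikit-learn; the function-word list and the four-line ‘corpus’ are made-up placeholders for illustration, not real stylometric data:

```python
# Toy sketch: author attribution from function-word frequencies alone.
# The word list and texts below are illustrative placeholders, not real data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

FUNCTION_WORDS = ["the", "a", "an", "of", "and", "to", "in", "that",
                  "it", "is", "was", "for", "on", "with", "as", "but"]

texts = [
    "the cat sat on the mat and it was pleased with that",
    "it is a truth that the house was cold in the winter",
    "a man walked to a town with a dog and a stick",
    "to see a world in a grain of sand is to see it all",
]
authors = ["A", "A", "B", "B"]

# Restrict the vocabulary to function words so content words carry no signal;
# the classifier then integrates many small frequency differences at once.
clf = make_pipeline(
    CountVectorizer(vocabulary=FUNCTION_WORDS),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, authors)

print(clf.predict(["the mat was warm and the cat was on it"]))  # likely 'A'
```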
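And a hedged sketch of the repetition trap itself, assuming the HuggingFace transformers library and GPT-2 as a convenient small model; pure greedy (argmax) decoding with no sampling or repetition penalty is the setting where completions tend to collapse into loops:

```python
# Sketch: greedy decoding often degenerates into repetition.
# Assumes `pip install transformers torch`.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("I am", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=60, do_sample=False)  # greedy argmax
print(tok.decode(out[0]))
# Typically ends up looping a phrase over and over; with worse prompts or models,
# the fixed point can shrink to a single repeated token ("aaaaaaaaa...").
```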
Ah yes, aaaaaaaaaaaaaaaaa, the most agentic string
You have to admit, in terms of the Eliezeresque definition of ‘agency/optimization power’ as ‘steering future states towards a small region of state-space’, aaa is the most agentic prompt of all! (aaaaaaaah -!)
Now I want a “who would win” meme, with something like “agentic misaligned deceptive mesa optimizer scheming to take over the world” on the left side, and “one screamy boi” on the right.