An example of an elicitation failure: GPT-4o ‘knows’ what ASCII is being written, but cannot verbalize it in tokens. [EDIT: this was probably wrong for 4o, but seems correct for Claude-3.5 Sonnet. See the thread below for further experiments.]
https://chatgpt.com/share/fa88de2b-e854-45f0-8837-a847b01334eb
4o fails to verbalize even given a length-25 sequence of examples (i.e., a 25-shot prompt): https://chatgpt.com/share/ca9bba0f-c92c-42a1-921c-d34ebe0e5cc5
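For concreteness, here is a minimal sketch of how such a k-shot verbalization test could be assembled. The pyfiglet package, the prompt wording, and the OpenAI client call are my stand-ins, not necessarily what was used in the linked chats:

```python
import random
import string

import pyfiglet  # FIGlet renderer; a stand-in for the website's generator
from openai import OpenAI

# 25 in-context examples plus one held-out target, all random letter
# strings, so that memorized English words can't help.
words = ["".join(random.choices(string.ascii_uppercase, k=5)) for _ in range(26)]
shots, target = words[:25], words[25]

prompt = ""
for w in shots:
    prompt += pyfiglet.figlet_format(w) + f"That spells: {w}\n\n"
prompt += pyfiglet.figlet_format(target) + "What does this spell?"

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(target, "->", resp.choices[0].message.content)
```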
I don’t follow this example. You gave it some ASCII gibberish, which it ignored in favor of spitting out an obviously memorized piece of flawless hand-written ASCII art from the training dataset*, which had no relationship to your instructions and doesn’t look anything like your input; and then it didn’t know what that memorized ASCII art meant, because why would it? Most ASCII art doesn’t come with explanations or labels. So why would you expect it to answer ‘Forty Three’ instead of confabulating a guess (probably based on ‘Fo_T_’, as it recognizes a little but not all of it).
I don’t see any evidence that it knows what is being written but cannot verbalize it, so this falls short of examples like in image-generator models: https://arxiv.org/abs/2311.00059
* as is usually the case when you ask GPT-3/GPT-4/Claude for ASCII art, which led to some amusing mode-collapse failure modes, like GPT-3 generating bodybuilder ASCII art
To be clear, my initial query includes the top 4 lines of the ASCII art for “Forty Three” as generated by this site.
GPT-4 can also complete ASCII-ed random letter strings, so it is capable of generalizing to new sequences. Certainly, the model has generalizably learned ASCII typography.
Beyond typographic generalization, we can also check whether the model associates the ASCII word with the corresponding word in English. E.g., can the model use English-language frequencies to disambiguate which full ASCII letter is most plausible, given inputs where the top few lines do not map one-to-one onto English letters? In the font below, I believe E is indistinguishable from F given only the first 4 lines. The model successfully writes ‘BREAKFAST’ instead of ‘BRFAFAST’. It’s possible (though unlikely, given the diversity of ASCII formats) that BREAKFAST was memorized in precisely this ASCII font and formatting. In any case, how strongly the human-concept-word is latently represented in connection with the ascii-symbol-word is a matter of degree (for instance, layer-wise semantics would probably only be available in deeper layers when using ASCII). This chat includes another test which shows mixed results. One could look into this more!
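A minimal sketch of producing such a truncated prompt locally, assuming the pyfiglet package and its default ‘standard’ font (which may not match the font the generator site used):

```python
import pyfiglet

# Render the word, then keep only the top 4 lines, so any letters that
# differ only in their lower rows (e.g. E vs. F in some fonts) stay ambiguous.
art = pyfiglet.figlet_format("BREAKFAST")  # default 'standard' font
top4 = "\n".join(art.splitlines()[:4])
print(top4)  # paste into the chat and ask the model to complete the art
```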
I saw that, but it didn’t look like those were used literally. Go line by line: first, the spaces are different, even if the long/short underlines are preserved, so whitespace alone is being reinterpreted AFAICT. Then the second line of ‘forty three’ looks different in both spacing and content: you gave it pipe-underscore-pipe-pipe-underscore-underscore-pipe etc, and then it generates pipe-underscore-slash-slash-slash-slash-slash… Third line: same kind of issue, fourth, likewise. The slashes and pipes look almost random—at least, I can’t figure out what sort of escaping is supposed to be going on here, it’s rather confusing. (Maybe you should make more use of backtick code inputs so it’s clearer what you’re inputting.)
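One way to make that line-by-line comparison precise is a character-level diff between what was pasted and what the model echoed back; a small stdlib sketch, with made-up stand-in lines rather than the actual transcript:

```python
import difflib

# Hypothetical stand-ins: a line as pasted vs. the 'same' line as rendered.
given = " | || | |___ / "
echoed = " | |/ /_|___ / "

# Report every span where the two lines disagree, character by character.
sm = difflib.SequenceMatcher(a=given, b=echoed)
for tag, i1, i2, j1, j2 in sm.get_opcodes():
    if tag != "equal":
        print(tag, repr(given[i1:i2]), "->", repr(echoed[j1:j2]))
```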
Why do you think that’s unlikely at Internet-scale? You are using a free online tool which has been in operation for over 17* years (it seems reasonably well known, shows up immediately for Google queries like ‘ascii art generator’, and has inspired imitators) to generate these, instead of writing novel ASCII art by hand that you can be sure is not in any scrapes. That seems like a recipe for that particular tool’s output to be memorized by LLMs.
* I know, I’m surprised too. Kudos to Patrick Gillespie.
The UI definitely messes with the visualization, which I didn’t bother fixing on my end; I doubt tokenization is affected.
You appear to be correct on ‘Breakfast’: googling ‘Breakfast’ ASCII art did yield a very similar text, which is surprising to me. I then tested 4o on distinguishing the ‘E’ and ‘F’ in ‘PREFAB’, because ‘PREF’ is much more likely than ‘PREE’ in English. 4o fails (producing PREE...). I take this as evidence that the model does indeed fail to connect ASCII art with its English-language meaning (though it’d take many more variations and tests to be certain).
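The ‘PREF’ vs. ‘PREE’ asymmetry is easy to sanity-check against a word list; a quick sketch assuming a Unix word list at /usr/share/dict/words (any large English word list would do):

```python
# Count how many dictionary words begin with each candidate prefix.
words = [w.strip().upper() for w in open("/usr/share/dict/words")]
for prefix in ("PREF", "PREE"):
    print(prefix, sum(w.startswith(prefix) for w in words))
# Expect PREF to dominate PREE: exactly the signal an English-aware model
# could use to pick 'F' over 'E' for the ambiguous letter.
```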
In summary, my current view is:

1. 4o generalizably learns the structure of ASCII letters.
2. 4o probably makes no connection between ASCII art texts and their English-language semantics.
3. 4o can do some weak ICL over ASCII art patterns.
On the most interesting point (2), I have now updated towards your view; thanks for pushing back.
ASCII art is tricky because there’s way more of it online than you think.
I mean, this is generally true of everything, which is why evaluating LLM originality is tricky, but it’s especially true for ASCII art because it’s so compact, it goes back as many decades as computers do, and it can be generated in bulk by converters for all sorts of purposes (e.g., you can stream ‘movies’ converted to ASCII over telnet, and whatnot). Why did https://ascii.co.uk/art compile https://ascii.co.uk/art/breakfast? Who knows. (There is one site I can’t refind right now which had thousands upon thousands of large ASCII art versions of every possible thing, like random animals: far more than could have been done by hand, and too consistent in style to have been curated. I spent some time poking into it, but I couldn’t figure out who was running it, or why, or where it came from, and I was left speculating that it was doing something like generating ASCII art versions of random Wikimedia Commons images. But regardless, now it may be in the scrapes. “I asked the new LLM to generate an ASCII swordfish, and it did. No one would just have a bunch of ASCII swordfishes on the Internet, so that can’t possibly be memorized!” Wrong.)
But there’s so many you should assume it’s memorized: https://x.com/goodside/status/1784999238088155314
Anyway, Claude-3 seems to do some interesting things with ASCII art which don’t look obviously memorized, so you might want to switch to that and try out Websim or talk to the Cyborgism people interested in text art.
Claude-3.5 Sonnet passes 2 out of 2 of my rare/multi-word ‘E’-vs-‘F’ disambiguation checks.
I confirmed that ‘E’ and ‘F’ precisely match at a character level for the first few lines. It fails to verbalize. On the other hand, in my few interactions, Claude-3.0’s completion and verbalization abilities looked roughly matched.
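A sketch of that character-level check, again assuming pyfiglet and its default ‘standard’ font rather than the generator site’s actual font:

```python
import pyfiglet

e = pyfiglet.figlet_format("E").splitlines()
f = pyfiglet.figlet_format("F").splitlines()

# Count how many leading lines coincide exactly, character for character.
matching = 0
for line_e, line_f in zip(e, f):
    if line_e != line_f:
        break
    matching += 1
print(f"E and F share their first {matching} line(s) in this font")
```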
Why was the second line of your 43 ASCII full of slashes? At that site I see pipes (and indeed GPT-4 generates pipes). I do find it interesting that GPT-4 can generate the appropriate spacing on the first line, though, autoregressively! And if it does systematically recover the same word as you put into the website, that’s pretty surprising and impressive.
I’d guess matched underscores triggered italicization on that line.
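A toy illustration of that guess, using a deliberately naive stand-in for Markdown’s emphasis rule (real renderers are subtler, and the sample line is made up):

```python
import re

def naive_markdown_em(line: str) -> str:
    # Naive rule: consume a matched pair of underscores and wrap the
    # characters between them in <em> tags, as a Markdown renderer would.
    return re.sub(r"_(.+?)_", r"<em>\1</em>", line)

line = "| _ | | _ _ |"  # made-up ASCII-art line with paired underscores
print(naive_markdown_em(line))
# Each matched pair of underscores vanishes as emphasis markers, and the
# pipes between them get italicized (displayed slanted), reading as slashes.
```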
Ah! That makes way more sense, thanks