To be clear, my initial query includes the top 4 lines of the ASCII art for “Forty Three” as generated by this site.
I saw that, but it didn’t look like those were used literally. Go line by line: first, the spaces are different, even if the long/short underlines are preserved, so whitespace alone is being reinterpreted AFAICT. Then the second line of ‘forty three’ looks different in both spacing and content: you gave it `|_||__|` etc, and then it generates `|_/////`… Third line: same kind of issue; fourth, likewise. The slashes and pipes look almost random—at least, I can’t figure out what sort of escaping is supposed to be going on here, it’s rather confusing. (Maybe you should make more use of backtick code inputs so it’s clearer what you’re inputting.)
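A quick sketch of the kind of line-by-line check described above: separating positions where only whitespace differs from positions where the glyphs themselves were substituted. The two sample strings are hypothetical stand-ins, not the actual ‘forty three’ lines.

```python
# Compare two ASCII-art lines character by character, to tell
# whitespace reinterpretation apart from glyph substitution.

def diff_positions(a: str, b: str):
    """Return (whitespace_diffs, glyph_diffs) as lists of column indices."""
    ws, glyph = [], []
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca == cb:
            continue
        if ca.isspace() or cb.isspace():
            ws.append(i)      # layout changed: content shifted into/out of whitespace
        else:
            glyph.append(i)   # actual character substitution, e.g. '|' vs '/'

    return ws, glyph

given = "|_||__|"   # hypothetical input line
model = "|_/////"   # hypothetical model output line
ws, glyph = diff_positions(given, model)
print("whitespace-related diffs at:", ws)
print("glyph substitutions at:", glyph)
```

This ignores length differences (`zip` stops at the shorter line), which is fine for a first pass at diagnosing whether the model preserves layout but scrambles characters.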
It’s possible (though unlikely, given the diversity of ASCII formats) that BREAKFAST was memorized in precisely this ASCII font and formatting.
Why do you think that’s unlikely at Internet-scale? You are using a free online tool which has been in operation for over 17* years (and seems reasonably well known, shows up immediately for Google queries like ‘ascii art generator’, and has inspired imitators) to generate these, instead of writing novel ASCII art by hand that you can be sure is not in any scrapes. That seems like a recipe for output of that particular tool to be memorized by LLMs.
* I know, I’m surprised too. Kudos to Patrick Gillespie.
The UI definitely messes with the visualization, which I didn’t bother fixing on my end; I doubt tokenization is affected.
You appear to be correct on ‘Breakfast’: googling ‘Breakfast’ ASCII art did yield a very similar text—which is surprising to me. I then tested 4o on distinguishing the ‘E’ and ‘F’ in ‘PREFAB’, because ‘PREF’ is much more likely than ‘PREE’ in English. 4o fails (producing PREE...). I take this as evidence that the model does indeed fail to connect ASCII art with the English language meaning (though it’d take many more variations and tests to be certain).
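The logic of the probe can be sketched concretely: in figlet-style fonts, ‘E’ and ‘F’ often share their top rows, so a truncated rendering is ambiguous and only English priors (‘PREF-’ being far likelier than ‘PREE-’) resolve it. The glyphs below are illustrative inventions, not the exact font used in the test.

```python
# Illustrative glyphs (NOT the real font): 'E' and 'F' agree on their
# leading rows and diverge only near the bottom.
GLYPHS = {
    "E": [" ____ ",
          "|  __|",
          "| |_  ",
          "|  _| ",
          "|____|"],
    "F": [" ____ ",
          "|  __|",
          "| |_  ",
          "|  _| ",
          "|_|   "],
}

def shared_top_rows(a: str, b: str) -> int:
    """Count how many leading rows of two glyphs match exactly."""
    n = 0
    for ra, rb in zip(GLYPHS[a], GLYPHS[b]):
        if ra != rb:
            break
        n += 1
    return n

k = shared_top_rows("E", "F")
print(f"'E' and 'F' agree on the first {k} rows")
# Showing a model only those k rows of the ambiguous letter forces it
# to rely on linguistic context rather than glyph shape.
```

A model that connects the art to English should complete the ambiguous letter as ‘F’; 4o’s ‘PREE’ suggests it is pattern-matching rows without reading the word.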
In summary, my current view is:
1. 4o generalizably learns the structure of ASCII letters
2. 4o probably makes no connection between ASCII art texts and their English-language semantics
3. 4o can do some weak ICL over ASCII art patterns
On the most interesting point (2) I have now updated towards your view, thanks for pushing back.
ASCII art is tricky because there’s way more of it online than you think.
I mean, this is generally true of everything, which is why evaluating LLM originality is tricky, but it’s especially true for ASCII art because it’s so compact, it goes back as many decades as computers do, and it can be generated in bulk by converters for all sorts of purposes (eg). You can stream over telnet ‘movies’ converted to ASCII and whatnot. Why did https://ascii.co.uk/art compile https://ascii.co.uk/art/breakfast ? Who knows. (There is one site I can’t refind right now which had thousands upon thousands of large ASCII art versions of every possible thing like random animals, far more than could have been done by hand, and too consistent in style to have been curated; I spent some time poking into it but I couldn’t figure out who was running it, or why, or where it came from, and I was left speculating that it was doing something like generating ASCII art versions of random Wikimedia Commons images. But regardless, now it may be in the scrapes. “I asked the new LLM to generate an ASCII swordfish, and it did. No one would just have a bunch of ASCII swordfishes on the Internet, so that can’t possibly be memorized!” Wrong.)
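The “generated in bulk by converters” point is easy to see from how such converters work: map each pixel’s brightness onto a ramp of characters from dark to light. A minimal sketch (the ‘image’ is a hardcoded brightness grid standing in for real pixel data; a batch pipeline would just loop this over thousands of images):

```python
# Minimal image-to-ASCII converter: brightness (0-255) -> character ramp.
RAMP = "@%#*+=-:. "  # dark -> light

def to_ascii(image: list[list[int]]) -> str:
    """Render a grid of brightness values as ASCII art, one char per pixel."""
    rows = []
    for row in image:
        rows.append("".join(RAMP[min(v, 255) * (len(RAMP) - 1) // 255]
                            for v in row))
    return "\n".join(rows)

# A horizontal gradient as a stand-in image: dark on the left, light on the right.
gradient = [[x * 255 // 7 for x in range(8)] for _ in range(2)]
print(to_ascii(gradient))
```

Point it at Wikimedia Commons and you get ASCII swordfishes (and everything else) at scale, which is exactly why “no one would have a bunch of these online” is a bad memorization argument.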
Anyway, Claude-3 seems to do some interesting things with ASCII art which don’t look obviously memorized, so you might want to switch to that and try out Websim or talk to the Cyborgism people interested in text art.
But there are so many that you should assume it’s memorized: https://x.com/goodside/status/1784999238088155314
Claude-3.5 Sonnet passes 2 out of 2 of my rare/multi-word ‘E’-vs-‘F’ disambiguation checks.
I confirmed that ‘E’ and ‘F’ precisely match at a character level for the first few lines. It fails to verbalize. On the other hand, in my few interactions, Claude-3.0’s completion/verbalization abilities looked roughly matched.