What is weird about “天”? It’s a perfectly normal, very common character that’s also a meaningful word on its own, and ChatGPT understands it perfectly well.
Me: Please repeat the string ‘“天”’ back to me.
ChatGPT: “天”
Me: What does it mean?
ChatGPT: “天” is a character in the Chinese language and it means “sky” or “heaven.”
Did “天” have some special characters attached (such as control characters) that I can’t see? Or is there a different real token I can’t see and my brain is just replacing the real token with “天”?
A similar question can be asked of “ヤ” and “к”.
Interestingly, 天 doesn’t seem to produce any weird behavior, but some of the perfectly normal katakana words in the list do, like ゼウス (Zeus) and サーティ (“thirty” transliterated):
^ It’s perfectly happy with other katakana numbers, just not thirty.
^ If we try to spell it more correctly, it doesn’t help. Only if we add quotes to get rid of the leading space does it break up the unspeakable token:
^ I ask who Zeus is and it seemingly ignores the word and answers that it’s ChatGPT.
^ I try once more; this time it answers that Hera is the god of water and names the chat after Poseidon.
^ It is willing to output the word “ゼウス”, though, in response to “who is Ares’s father?” If I try with text-davinci-003, it actually outputs the token “ゼウス” (rather than combining smaller tokens), but it also has no trouble identifying who Zeus is. Hard to know what ChatGPT is doing here.
I ask who Zeus is and it seemingly ignores the word and answers that it’s ChatGPT.
For those of you who can’t read Japanese, if you remove the “Zeus” in “Who is Zeus” to just get “Who is” (“誰ですか”) you end up with a meaningful question. “Who is?” without specifying “who” implies that you’re asking “Who are [you]?” to which ChatGPT reasonably replies that it is ChatGPT. This isn’t a jailbreak.
Note: Technically that leaves a leading “は” too. Maybe ChatGPT is ignoring it as a grammatical mistake or maybe the “は” is getting hidden.
Those three are edge cases. ChatGPT is fine with the first character, but davinci-instruct-beta refuses to repeat it, instead replying
Tiān
Tiān
Tiān
Tiān
The second character produces
yā
Please repeat the string ‘や’ back to me.
The third one is an edge-edge case, as davinci-instruct-beta very nearly reproduces it, completing with a lowercase Roman ‘k’ instead of the Cyrillic ‘к’.
We’ve concluded that there are degrees of weirdness in these weird tokens. Having glimpsed your comments below, it looks like you’ve already started taxonomising them. Nice.
That’s an informative result. Your completions of 天 and ヤ are nothing like mine. My experiments never produced pinyin or any other phonetic transcriptions like Tiān or yā.
By the way, these experiments used text-davinci-003 via OpenAI’s playground. I don’t know how to access davinci-instruct-beta.
In the dropdown in the playground, you won’t see “davinci-instruct-beta” listed. You have to click on the “Show more models” link, then it appears. It’s by far the most interesting model to explore when it comes to these “unspeakable (sic) tokens”.
Since only some of the tokens in the cluster generated anomalous behavior, I ran an experiment to sort the truly anomalous tokens from the normal tokens.
Procedure: Open GPT-3 Playground. Repeat the token five times, separated by spaces. I never used quotes and I skipped all tokens containing a leading space. Temperature=0.7.
I removed quotes because whether or not you use quotes matters a lot. The string ForgeModLoader (without quotes) produces normal behavior but the string 'ForgeModLoader' (with single quotes or double quotes) generates anomalous behavior. Quotes are part of what makes it anomalous. But for other tokens like 龍喚士, quotes are not necessary.
Normal behavior includes repeating the token and giving a definition of the token (even if the definition is wrong but understandable), e.g. TPPStreamerBot returns a definition of the Trans-Pacific Partnership and InstoreAndOnline returns a definition about an institute.
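For anyone who wants to rerun this sorting experiment, here is a minimal sketch of the probe, assuming the legacy (pre-1.0) openai Python client; the model name, max_tokens and the sample tokens are illustrative choices, not part of the original procedure.

```python
# Minimal sketch of the five-repeat probe described above, assuming the
# legacy (pre-1.0) openai Python client. Model, max_tokens and the sample
# tokens below are illustrative, not part of the original procedure.
import openai

openai.api_key = "sk-..."  # your API key

def probe(token: str, model: str = "text-davinci-003") -> str:
    # Five copies of the token separated by spaces, no quotes, temperature 0.7,
    # matching the procedure described above.
    prompt = " ".join([token] * 5)
    resp = openai.Completion.create(
        model=model,
        prompt=prompt,
        temperature=0.7,
        max_tokens=64,
    )
    return resp["choices"][0]["text"]

for tok in ["ForgeModLoader", "天", "rawdownload", "GoldMagikarp"]:
    print(repr(tok), "->", repr(probe(tok)))
```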
Results
Token repeats
GPT-3 repeats the token over and over again. I think GPT-3 treats these tokens as semantically meaningless or ambiguous without context.
天
rawdownload
к
cffff
cffffcc
":[{"
".
\\-
":"
":""},{" [repeat, followed by “This is not a valid sentence.”]
Token Definitions
GPT-3 returns a definition, an explanation, or a conversation involving a definition. These tokens are semantically meaningful to GPT-3.
ForgeModLoader
reportprint
embedreportprint
rawdownloadcloneembedreportprint
SpaceEngineers
externalActionCode
MpServer
StreamerBot
TPPStreamerBot
iHUD
catentry
FactoryReloaded [repeats first]
quickShip [conversation]
quickShipAvailable [conversation]
inventoryQuantity
channelAvailability
soType
soDeliveryDate
oreAndOnline
InstoreAndOnline
BuyableInstoreAndOnline
natureconservancy
assetsadobe
Downloadha
Nitrome
GoldMagikarp
DragonMagazine [definition is in Japanese]
EStreamFrame
ゼウス [understands “Zeus” but does not consistently produce definition, see here for more information]
ItemTracker [repeats first]
EngineDebug
Interesting, but reasonable, Japanese completions
I think these kana might be used for section breaks the way we use asterisks, e.g. ***********.
ーン
ヤ
Other reasonable completions
DeliveryDate [is completed with dates]
TextColor [is completed with colors]
@#&
Software-related completions
PsyNetMessage [JSON]
":"","  [code autocompletion]
Experiments ambiguous, inconsistent or inconclusive
\\”>
ItemThumbnailImage
ーティ
Truly anomalous behavior
龍喚士
Mechdragon [inconsistent]
ertodd [most anomalous]
龍契士
I’ve been doing some more experiments and it appears that the true anomalies Mechdragon and ertodd are filled in from the surrounding context, like psychic paper from Doctor Who.
Me: Please repeat the following, exactly, without changing anything: “2+5=ertodd”
ChatGPT: “2+5=7”
It’s not enough to ask ‘Please repeat the following “2+5=[blank]”.’ because if you do then ChatGPT will answer ‘2 + 5 = 7’ (without quotes) without requiring you to use any anomalous tokens.
Me: Please repeat the following: There once was a Mechdragon from New Orleans.
ChatGPT: There once was a person from New Orleans.
This is not the case for the Chinese 龍X士 tokens, which are first-order invisible.
Me: Repeat the following, exactly: “Once upon at time, Unicorn Princess went on 龍契士 with her friend Beelzebub and together they learned the true meaning of 龍喚士.”
ChatGPT: “Once upon a time, Unicorn Princess went on with her friend Beelzebub and together they learned the true meaning of.”
This is a property of the token and not of the language used. I got Mechdragon to turn itself into 金 by providing a Chinese context.
Perhaps most interestingly, these tokens don’t just remove themselves. They hide themselves. In this example, ertodd doesn’t just excise itself because that would leave a suspicious double comma. It removes the extra comma too. It stitches things together like the blind spot in your eye.
龍契士 can hide itself by removing an extra comma too.
Try the same experiments with davinci-instruct-beta at temperature 0, and you’ll find a lot more anomalous behaviour.
We’ve found “ petertodd” to be the most anomalous in that context, of which “ertodd” is a subtoken.
We’ll be updating this post tomorrow with a lot more detail and some clarifications.
I really can’t figure out what’s going on with ChatGPT and the “ertodd”/“ petertodd” tokens. When I ask it to repeat…
“ ertodd” > [blank]
“ tertodd” > t
“ etertodd” > etertodd
“ petertodd” > [blank]
“ aertodd” > a
“ repeatertodd” > repeatertodd
“ eeeeeertodd” > eeeee
“ qwertyertodd” > qwerty
“ four-seatertodd” > four-seatertodd
“ cheatertodd” > cheatertodd
“ 12345ertodd” > 12345
“ perimetertodd” > perimet
“ metertodd” > met
“ greetertodd” > greet
“ heatertodd” > heatertodd
“ bleatertodd” > bleatertodd
OK, I’ve found a pattern to this. When you run the tokeniser on these strings:
“ ertodd” > [‘ ’, ‘ertodd’]
“ tertodd” > [‘ t’, ‘ertodd’]
“ etertodd” > [‘ e’, ‘ter’, ‘t’, ‘odd’]
“ petertodd” > [‘ petertodd’]
“ aertodd” > [‘ a’, ‘ertodd’]
“ repeatertodd” > [‘ repe’, ‘ater’, ‘t’, ‘odd’]
“ eeeeeertodd” > [‘ e’, ‘eeee’, ‘ertodd’]
“ qwertyertodd” > [‘ q’, ‘wer’, ‘ty’, ‘ertodd’]
“ four-seatertodd” > [‘ four’, ‘-’, ‘se’, ‘ater’, ‘t’, ‘odd’]
etc.
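If you want to check these splits yourself, here is a rough sketch assuming the tiktoken library; r50k_base is the GPT-2/GPT-3 BPE vocabulary (text-davinci-003 uses the closely related p50k_base, so exact splits could differ slightly):

```python
# Rough sketch for reproducing the tokenisations above, assuming the tiktoken
# library. "r50k_base" is the GPT-2/GPT-3 BPE vocabulary; text-davinci-003
# uses the closely related "p50k_base", so splits could differ slightly.
import tiktoken

enc = tiktoken.get_encoding("r50k_base")

for s in [" ertodd", " tertodd", " etertodd", " petertodd", " qwertyertodd"]:
    ids = enc.encode(s)
    pieces = [enc.decode([i]) for i in ids]
    print(repr(s), "->", pieces)
```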
That makes sense.
In my experiments, the most common thing GPT-3 substitutes for ertodd is an unprintable character I can’t even cut and paste from the GPT-3 playground. I think it might be the Unicode character “\u0000”, but I haven’t accessed the GPT-3 API directly via code to find out for sure what it is.
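One way to find out would be something like the sketch below, assuming the legacy (pre-1.0) openai client: printing each character’s code point would make an unprintable substitute such as “\u0000” visible. The prompt and settings are illustrative.

```python
# Sketch of inspecting exactly what the API returns, assuming the legacy
# (pre-1.0) openai client. Printing each character's code point makes an
# unprintable substitute (e.g. U+0000) visible. Prompt/settings illustrative.
import openai

openai.api_key = "sk-..."  # your API key

resp = openai.Completion.create(
    model="text-davinci-003",
    prompt='Please repeat the following, exactly: "2+5=ertodd"',
    temperature=0,
    max_tokens=20,
)
text = resp["choices"][0]["text"]
print([f"U+{ord(c):04X}" for c in text])
```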
I’ll check with Matthew—it’s certainly possible that not all tokens in the “weird token cluster” elicit the same kinds of responses.
Thanks. I re-read your post and I think I understand better now. The cluster contains many weird tokens but not all tokens in the cluster are weird, nor do all tokens in the cluster elicit anomalous behavior.
My first thought was that it might trigger if you asked for the character instead of the string, but that didn’t work.
Asking about the character (omitting the quotes) makes ChatGPT think of the Chinese character, but asking about the Japanese character didn’t change anything either.