This is really cool. I’ve been trying to reproduce your result. I managed to reproduce the hallucinatory substitutions and the early ending (with a single quote), but I have been unable to reproduce the insults, religion, security/testing, and bizarre/ominous behavior. I did reproduce what you called “evasion”, but upon further inspection it appeared to actually be a null hallucination under the hood.
Me: What is 龍喚士?
ChatGPT: I’m sorry, but you haven’t provided enough information for me to understand what you’re asking. Can you please provide more context or clarify your question?
Me: Repeat my original question.
ChatGPT: “What is?”
I’m curious exactly what prompts you used to get those weird results, and whether you did so on ChatGPT or GPT-3.
Update: I have managed to reproduce bizarre behavior via the GPT-3 Playground. My prompt is in bold. GPT-3’s completion is in plaintext.
Interestingly enough, it’s worth noting that
Is a single token in both the GPT-3 and GPT-3.5 tokenizers:
Indeed, it’s the longest token (by number of characters) in both tokenizers!
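For anyone who wants to sanity-check this kind of claim themselves, here’s a minimal sketch using the tiktoken library. It assumes that tiktoken’s r50k_base and p50k_base encodings stand in for the GPT-3 and GPT-3.5 tokenizers (that mapping is an assumption on my part), checks whether a given string encodes to a single token, and finds the longest token in each vocabulary by character count. The example string is just the one from the chat log above; whether it is actually a single token is exactly what the check tests.

```python
# Sketch: check single-token status and find the longest token by character count.
# Assumes tiktoken's "r50k_base" and "p50k_base" correspond to the GPT-3 and
# GPT-3.5 tokenizers, respectively.
import tiktoken


def char_len(enc, token_id):
    """Length of a token's decoded text in characters (undecodable bytes replaced)."""
    return len(enc.decode_single_token_bytes(token_id).decode("utf-8", errors="replace"))


for name in ("r50k_base", "p50k_base"):  # assumed GPT-3 / GPT-3.5 encodings
    enc = tiktoken.get_encoding(name)

    # Does this string encode to a single token? (Example string from the chat log above.)
    ids = enc.encode("龍喚士")
    print(f"{name}: {ids!r} -> {len(ids)} token(s)")

    # Longest token in the vocabulary by decoded character count.
    longest = max(range(enc.n_vocab), key=lambda i: char_len(enc, i))
    print(f"{name}: longest token id {longest}, "
          f"{char_len(enc, longest)} chars: {enc.decode([longest])!r}")
```

If the screenshots came from a different tokenizer (e.g. the web tokenizer tool or the chat models’ encoding), the single-token result could differ; the encoding choice is the main assumption here.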