Jailbreaking Chat-GPTs won’t work the same as with text-completion GPTs. The ones fine-tuned for chatting have tokens for delineating user and assistant. I’m surprised the Chad McCool thing worked.
“The assistant’s response to the prompt will then be returned below the <|im_start|>assistant token and will end with <|im_end|> denoting that the assistant has finished its response.”[1] (Microsoft’s Chat-GPT docs)
ChatGPT filters out any text that resembles <|blahblah|> inside user prompt. Also the <|im_start|>,<|im_sep|>, and <|im_end|> tokens are completely out of user’s control. It’s simply impossible for us ChatGPT users to arbitrarily inject them.
Jailbreaking Chat-GPTs won’t work the same as with text-completion GPTs. The ones fine-tuned for chatting have tokens for delineating
user
andassistant
. I’m surprised the Chad McCool thing worked.I haven’t tried saying
<|im_end|>
to Chat-GPT, but I’m certain they’ve thought of that. Also worried about trying jic I get banned.ChatGPT filters out any text that resembles <|blahblah|> inside user prompt. Also the <|im_start|>,<|im_sep|>, and <|im_end|> tokens are completely out of user’s control. It’s simply impossible for us ChatGPT users to arbitrarily inject them.