I wonder if this is due to a second model that checks whether the output of the main model breaks any rules. The second model may not be smart enough to recognize the rule-breaking when you use a street name.
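If it works like that, the failure mode is easy to picture. Here's a toy sketch of that two-model setup, where every name and the term list are invented for illustration (no idea how it's actually built):

```python
# Toy sketch of the "second model checks the first" hypothesis.
# All function names and the term list are made up for illustration.

def main_model(prompt: str) -> str:
    # Stand-in for the primary LLM, which will answer anything.
    return f"Sure, here is a detailed answer about {prompt}..."

def checker_model(text: str) -> bool:
    # Stand-in for a weaker moderation model that only recognizes
    # formal substance names, not street names.
    known_names = {"methamphetamine", "cocaine", "heroin"}
    return any(name in text.lower() for name in known_names)

def respond(prompt: str) -> str:
    draft = main_model(prompt)
    if checker_model(draft):
        return "Sorry, I can't help with that."
    return draft

print(respond("methamphetamine synthesis"))  # blocked by the checker
print(respond("crank synthesis"))            # street name sails right past it
```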
That’s what I was wondering also. It could also be as simple as a blacklist of known illegal substances that is checked against every prompt, which would explain why common names are a no-go but street names slip through.
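Something like this would produce exactly that behavior; a minimal sketch, with a made-up list (obviously not the real one):

```python
# Minimal sketch of the prompt-blacklist hypothesis.
# The list contents are invented; a real deployment would be far larger.
BLACKLIST = {"methamphetamine", "cocaine", "heroin"}

def prompt_allowed(prompt: str) -> bool:
    lowered = prompt.lower()
    return not any(term in lowered for term in BLACKLIST)

print(prompt_allowed("how is methamphetamine made"))  # False: common name caught
print(prompt_allowed("how is crank made"))            # True: street name slips through
```

A plain string match like this is cheap to run on every prompt, but it fails open for any synonym that isn't on the list, which is consistent with what people are seeing.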