I’m not entirely sure why, but I find it trivial to get GPT-4o to output harmful content in advanced voice mode (AVM), as long as the content isn’t covered by a direct content filter (e.g. NSFW).
In the span of 30 minutes it gave me (1) instructions for making a pipe bomb to attach to a car and (2) instructions on how to leak a virus into the public without detection (I’m avoiding saying more on this).
I have a theory as to why it might be easy for me specifically, but I would like to know whether others have had the same experience with AVM.
I haven’t tried harmful outputs, but FWIW I’ve tried getting it to sing a few times and found that pretty difficult.
Huh. That is extremely useful. Thank you.
I’ve got a lot of singing out of AVM. While my current method works well for this, I find it more challenging than eliciting harmful outputs.
Did you use any specific prompt in memory or custom instructions?
Omg. Oops! I completely forgot about custom instructions and memory! I’ll run some more trials with those off. Thank you very much for pointing this out.