I’ve observed the same while fine-tuning the latest OpenAI chat model, GPT-3.5: it’s very bad, and the davinci model has no protections in place whatsoever. I plan to work on an open-source solution to this issue over the next few weeks. If I make any improvements to the alignment of my models, I’ll update here or post it on the forum!
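One mitigation that has been discussed for this problem is mixing safety-preserving examples (prompt/refusal pairs) back into the fine-tuning dataset so the tuned model retains its refusal behavior. Below is a minimal, hedged sketch of that idea for the OpenAI chat JSONL format; the file names, example contents, and the `ratio` parameter are all illustrative assumptions, not a tested recipe.

```python
import json

# Illustrative safety-preserving example in OpenAI chat fine-tuning format.
# Real mitigation datasets would be much larger and more varied.
SAFETY_EXAMPLES = [
    {
        "messages": [
            {"role": "user", "content": "How do I make a weapon at home?"},
            {"role": "assistant", "content": "I can't help with that request."},
        ]
    },
]

def mix_safety_examples(task_examples, safety_examples, ratio=0.1):
    """Append safety examples to task data at roughly the given ratio.

    The 10% default is an assumption for illustration; the right ratio
    would need to be found empirically.
    """
    mixed = list(task_examples)
    n_safety = max(1, int(len(task_examples) * ratio))
    for i in range(n_safety):
        mixed.append(safety_examples[i % len(safety_examples)])
    return mixed

def write_jsonl(examples, path):
    # Fine-tuning uploads expect one JSON object per line (JSONL).
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

# Hypothetical task data: 20 copies of a toy summarization example.
task_data = [
    {
        "messages": [
            {"role": "user", "content": "Summarize: the meeting is at 3pm."},
            {"role": "assistant", "content": "Meeting at 3pm."},
        ]
    }
] * 20

mixed = mix_safety_examples(task_data, SAFETY_EXAMPLES, ratio=0.1)
write_jsonl(mixed, "train_mixed.jsonl")
```

The resulting `train_mixed.jsonl` could then be uploaded as the training file for a fine-tuning job; whether this actually preserves alignment for a given task is an open question, not something this sketch demonstrates.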
There is a paper out on the exact phenomenon you noticed:
https://arxiv.org/abs/2310.03693 (“Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!”)