I actually did not realize they released the base model. There’s research showing how easy it is to remove the safety fine-tuning, which is where I got the framing (and probably where Zvi got it too), but perhaps that was more of a proof of concept than the main concern in this case.
The fact that fine-tuning can be removed is pretty important for safety, but I will change my wording where possible to also mention that it is bad to release the base model without any safety fine-tuning. I just requested the Llama 2 download, so I’ll see what options they give.
Here’s my comment with references where I attempted to correct Zvi’s framing. He probably didn’t notice it, since he used the framing again a couple of weeks later.