My current best guess for why base models refuse so much is that “Sorry, I can’t help with that. I don’t know how to” is actually extremely common on the internet, based on discussion with Achyuta Rajaram on twitter: https://x.com/ArthurConmy/status/1840514842098106527
This fits with our observations about how frequently LLaMA-1 performs incompetent refusal
My current best guess for why base models refuse so much is that “Sorry, I can’t help with that. I don’t know how to” is actually extremely common on the internet, based on discussion with Achyuta Rajaram on twitter: https://x.com/ArthurConmy/status/1840514842098106527
This fits with our observations about how frequently LLaMA-1 performs incompetent refusal