They kind of have an underlying personality, in the sense that they have propensities (like comparing things to tapestries, or saying “let’s delve into”), but IMO those propensities don’t reflect underlying wants any more than the RLHF persona’s do (and, rather importantly, there’s no sequence of prompts that will enable an LLM to freely choose its words).
I think the “LLM Whisperer” frame is that there’s no such thing as “underlying wants” in a base LLM model, that the base LLM model is just a volitionless simulator and the only “wants” there are are in the RLHF’d or prompt-engineered persona.
I likewise would bet that they’re wrong about this in the relevant sense: that whether or not this holds for the SoTA models, it won’t hold for any AGI-level model we’re on-track to get (though I think they might actually claim we already have “AGI-level” models?).
Yeah, that’s an issue too.