I think it’s important to put more effort into tracking such definitional issues, though. People end up overstating things because they round off their interlocutors’ viewpoints to their own. For instance, suppose person C asks “is it safe to scale generative language pre-training and ChatGPT-style DPO arbitrarily far?”, and person D rounds this off to “is it safe to make transformer-based LLMs as powerful as possible?” and answers “no, because instrumental convergence and compression priors”. That answer is probably just false for the question as originally posed.
If this repeatedly happens to the point of generating a consensus for the false claim, then that can push the alignment community severely off track.