Language models will inevitably end up contradicting themselves because they have finite memory. Asking them not to contradict themselves over a sufficiently large amount of text is asking for the impossible: they figure out what to output as the current token by looking at only the last n tokens, so if the contradicting fact lies further back than that, there is no way for the model to update on it. And no increase in the size of the models, without a fundamental architecture change, will fix that problem.
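A minimal sketch of that "last n tokens" point, with a made-up window size and a plain Python list standing in for the token history (no real model or tokenizer involved):

```python
# Toy illustration of a fixed context window: the model only ever
# conditions on the last n_ctx tokens of the conversation.
N_CTX = 4096  # made-up window size for this example

def visible_context(history_tokens, n_ctx=N_CTX):
    # Everything earlier than the last n_ctx tokens is simply not seen.
    return history_tokens[-n_ctx:]

history = list(range(10_000))      # stand-in for a long conversation
window = visible_context(history)  # only tokens 5904..9999 remain

# A fact stated at position 0 has fallen out of the window, so nothing
# stops the model from contradicting it in its next output.
assert 0 not in window
print(len(window), window[0])      # 4096 5904
```

Making n_ctx bigger just pushes the cutoff further back; it doesn't remove it, which is why size alone, without an architecture change, doesn't fix the problem.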
But architecture changes to deal with that problem might be coming...
When I talk about self-contradiction in this post, I’m talking about the model contradicting itself in the span of a single context window. In other words, when the contradicting fact is “within the last n tokens.”
Aha, thanks for clarifying this; was going to ask this too. :)