After thinking about it a little bit, the only hypothesis I could come up with for what’s going on in the negation example is that the smaller models understand the Q&A format and understand negation, but the larger models have learned that negation inside a Q&A is unusual and so disregard it.