GPT-4 may beat most humans on a variety of challenging exams (page 5 of the GPT-4 paper), but still can’t reliably count the number of words in a sentence.
One small question:
Should we even think that the number of words is an objective property of a linguistic system (at least in some cases)? It seems to me that there are grounds to doubt that based on how languages work.
It still fails to predict our answers regardless, I suppose.
It’s pretty easily definable in English, at least in special cases, and my understanding is that GPT-4 fails in those cases.
(I suppose you know this)
Ok, I say it because, from a semantic perspective, it’s not obvious to me that there has to be a natural sense of wordhood. ‘Words’ are often composed of different units of meaning, and the composition doesn’t have to preserve the exact original meaning unaltered, and there are many phrases that have a fixed meaning that can’t be derived from a literal analysis of the meaning of those ‘words’.
It might be arbitrary why some count as words and some don’t, but if you say that it can be “easily defined”, I believe you; I don’t really know myself.
Yeah, I guess I think words are the things with spaces between them. I get that this isn’t very linguistically deep, and there are edge cases (e.g. hyphenated things, initialisms), but there are sentences that have an unambiguous number of words.
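To make the "things with spaces between them" definition concrete, here is a minimal Python sketch. The function name `count_words` is just something I made up for illustration, and it deliberately ignores the edge cases mentioned above: a hyphenated compound counts as one word, and punctuation stays attached to its neighbour.

```python
def count_words(sentence: str) -> int:
    """Count words under the naive rule: words are the things with spaces between them."""
    # str.split() with no arguments splits on runs of whitespace
    # and discards empty strings, so repeated spaces don't inflate the count.
    return len(sentence.split())


if __name__ == "__main__":
    print(count_words("The cat sat on the mat."))  # 6
    print(count_words("state-of-the-art"))         # 1 (hyphenated compound, an edge case)
```

Under this rule, a sentence like "The cat sat on the mat." has an unambiguous count, which is the kind of case I have in mind.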