Ok, I say it because, from a semantic perspective, it’s not obvious to me that there has to be a natural sense of wordhood. ‘Words’ are often composed of different units of meaning, and the composition doesn’t have to preserve the original meaning unaltered. There are also many phrases with a fixed meaning that can’t be derived from a literal analysis of the meaning of those ‘words’.
It might be arbitrary why some count as words and some don’t, but if you say that it can be “easily defined”, I believe you; I don’t really know myself.
Yeah, I guess I think words are the things with spaces between them. I get that this isn’t very linguistically deep, and there are edge cases (e.g. hyphenated things, initialisms), but there are sentences that have an unambiguous number of words.
So wordhood is pretty easily definable in English, at least in those special cases, and my understanding is that GPT-4 fails in those cases.
(I suppose you know this)
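To make the "things with spaces between them" definition concrete, here is a minimal sketch in Python. The function name `count_words` is just something I made up for illustration, and the rule is deliberately naive: it splits on whitespace only, so hyphenated forms and initialisms each count as one word.

```python
def count_words(sentence: str) -> int:
    """Count words under the naive rule: a word is a run of
    non-whitespace characters separated by whitespace."""
    # str.split() with no arguments splits on runs of whitespace
    # and ignores leading and trailing whitespace.
    return len(sentence.split())


# A sentence with an unambiguous word count.
print(count_words("The cat sat on the mat"))   # 6

# Edge cases: under this rule "well-known" and "FBI" are one word each.
print(count_words("A well-known FBI agent"))   # 4
```

This doesn't settle the linguistic question, of course; it only shows that for sentences without the edge cases, the count is mechanical.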