Sure, casual use of categories is convenient and pretty good for a lot of purposes. [...] Where precision matters, though, you’re better off using more words. Don’t try to cram so much inferential power into a categorization which isn’t a good fit for the domain of predictions you’re making.
So, I actually don’t think “casual” vs. “precise” is a good characterization of the distinction I was trying to make in the grandparent! I’m saying that for “sparse”, tightly-clustered distributions in high-dimensional spaces, something like “essentialism” is actually doing really useful cognitive work, and using more words to describe more basic, lower-level (“precise”?) features doesn’t actually get you better performance—it’s not just about minimizing cognitive load.
A good example might be the recognition of accents. Which description is more useful, both for your own thinking and for communicating your observations to others—
“She has a British accent”; or
“She only pronounces the phoneme /r/ when it is immediately followed by a vowel, and her speech has three different open back vowels, and …”?
At the level of consciousness, it’s much easier to correctly recognize accents than to characterize and articulate all the individual phoneme-level features that your brain is picking up on to make the categorization. Categories let you make inferences about hidden variables that you haven’t yet observed in a particular case, but which are known to correlate with features that you have observed. Once you hear the non-rhoticity in someone’s speech, your brain also knows how to anticipate how they’ll pronounce vowels that they haven’t yet said—and where the person grew up! I think this is a pretty impressive AI capability that shouldn’t be dismissed as “casual”!
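(To see the inference pattern concretely: a minimal naive-Bayes sketch in Python. All of the numbers, feature names, and base rates here are invented for illustration, not real phonetic or demographic data.)

```python
# Minimal naive-Bayes sketch of category-mediated inference.
# "broad_a" stands in for a vowel feature the listener hasn't heard yet;
# all probabilities are made up for illustration.

priors = {"British": 0.1, "American": 0.9}          # assumed base rates
p_non_rhotic = {"British": 0.90, "American": 0.05}  # P(observed feature | category)
p_broad_a = {"British": 0.80, "American": 0.02}     # P(unheard feature | category)

# Hear non-rhoticity: P(category | non-rhotic) is proportional to
# P(non-rhotic | category) * P(category).
joint = {c: priors[c] * p_non_rhotic[c] for c in priors}
posterior = {c: j / sum(joint.values()) for c, j in joint.items()}

# Predict the not-yet-heard vowel feature by marginalizing over categories.
p_broad_a_before = sum(priors[c] * p_broad_a[c] for c in priors)
p_broad_a_now = sum(posterior[c] * p_broad_a[c] for c in posterior)

print(posterior)                        # {'British': ~0.67, 'American': ~0.33}
print(p_broad_a_before, p_broad_a_now)  # ~0.10 before, ~0.54 after
```

One observed feature moves the posterior on the category, and the category then carries the prediction to features that were never directly observed—that’s the useful cognitive work the label is doing.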
Accents are a good example. It’s easy to offend someone or to make incorrect predictions based on “has a British accent”, when you really only know some patterns of pronunciation. In some contexts, that’s a fine compression: way easier to process, communicate, and remember. In other contexts, you’re better off highlighting and acknowledging that your data supports many interpretations, and you should preserve that uncertainty in your communication and predictions.
“Casual” vs. “precise” are themselves lossy compressions of fuzzy concepts, and what I really mean is that compression is valid and helpful sometimes, and harmful and misleading at other times. My point is that the distinction is _NOT_ primarily about how tight the cluster is, or how closely it matches some dimensions of reality in the abstract. The acceptability of the compression depends on the context and uses of the compressed or less-compressed information, and on whether the lost details are important for the purpose of the communication or prediction. It’s whether it meets the needs of the model, not how close it is to “reality”.
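(A toy simulation of that purpose-dependence, with population proportions invented for illustration: the same one-bit compression, “has a British accent”, serves one prediction task well and another poorly.)

```python
import random
random.seed(0)

# Toy population: accent label, one pronunciation feature, one life fact.
# Pronunciation tracks the label tightly; residence only loosely (people move).
# All rates are made up for illustration.
people = []
for _ in range(10_000):
    british = random.random() < 0.5
    non_rhotic = random.random() < (0.95 if british else 0.05)
    lives_in_uk = random.random() < (0.60 if british else 0.05)
    people.append((british, non_rhotic, lives_in_uk))

def label_accuracy(target_index):
    """Accuracy of predicting the target attribute straight from the label."""
    return sum(p[0] == p[target_index] for p in people) / len(people)

print(label_accuracy(1))  # ~0.95: fine compression for predicting pronunciation
print(label_accuracy(2))  # ~0.78: misleading for predicting where someone lives
```

Same label, same lost details; whether the loss is acceptable is a fact about the question being asked, not about the label itself.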
Note also that I recognize that no model and no communication is actually full-fidelity. Everything any agent knows is compressed and simplified from reality. The question is how much further compression is valuable for what purposes.
Essentialism is wrong. Conceptual compression and simplified modeling are always necessary, and sometimes even an extreme compaction is good enough for a purpose.