Clusters in thingspace have been bothering me. Or rather, EY’s discussion of them has. What I want to do is rephrase the term as “clusters-according-to-cognitive-system-C.” Thingspace is high-dimensional, and which dimensions loom larger than others depends on the perceptual and motivational structure of the cognizer. In machine learning classification algorithms, it’s common to normalize each dimension by its mean and standard deviation, or by its extrema, but there’s no hard and fast rule for which to use. And besides, one typically first selects which dimensions to model at all.
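To make the normalization point concrete, here is a minimal sketch of the two conventions mentioned above, using only the Python standard library. The function names and the sample data are my own illustrative choices, not anything from EY's posts or from a particular ML library.

```python
from statistics import mean, pstdev

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical dimension with one outlier among otherwise similar values.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]

print(z_score(sample))
print(min_max(sample))
```

Both are just per-dimension linear rescalings, but they divide by different quantities (spread versus range), so they stretch an outlier-heavy dimension by different amounts relative to the other dimensions. Which one is "right" is exactly the kind of choice that depends on the system doing the clustering.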
If dimensions aren’t normalized, it matters how we scale them. Are two events separated by ten minutes and no spatial distance closer together or farther apart than two events separated by 200 km and one second? Well, it depends. Are we studying cosmology, or planning a vacation?
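The two-events question can be worked through as a rough sketch. I am assuming a plain Euclidean distance after converting time into kilometers at some exchange rate (not the relativistic spacetime interval), with the speed of light standing in for the cosmologist's rate and an assumed 100 km/h driving speed for the vacationer's. The function names are hypothetical.

```python
import math

def weighted_distance(dt_seconds, dx_km, km_per_second):
    """Euclidean distance after converting the time gap to km at the given rate."""
    return math.hypot(dt_seconds * km_per_second, dx_km)

# (time gap in seconds, spatial gap in km)
PAIR_A = (600.0, 0.0)   # ten minutes apart, same place
PAIR_B = (1.0, 200.0)   # one second apart, 200 km apart

SPEED_OF_LIGHT = 299_792.458    # km/s
DRIVING_SPEED = 100.0 / 3600.0  # km/s, i.e. 100 km/h (an assumed figure)

def closer_pair(km_per_second):
    """Return 'A' or 'B' for whichever pair of events is closer at this rate."""
    a = weighted_distance(*PAIR_A, km_per_second)
    b = weighted_distance(*PAIR_B, km_per_second)
    return "A" if a < b else "B"

print("cosmology:", closer_pair(SPEED_OF_LIGHT))  # B
print("vacation:", closer_pair(DRIVING_SPEED))    # A
```

At light-speed scaling, ten minutes is about 180 million km, which dwarfs 200 km, so pair B is closer. At driving scaling, ten minutes is under 17 km, so pair A is closer. The data are the same and the answer flips.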
None of this invalidates EY’s cautions about the use of words. It just adds one more aspect to watch out for.