One criterion you can use for allocating a latent is whether you have a lot of evidence about that latent. This is basically an information-theoretic approach, and it works well for latents that are common, spatially large/highly internally redundant, and long-term persistent. Boulders would be a core example of a case where it works well.
If you had a fluid, I think these kinds of latents would show up as things like chemical composition/pollution, temperature/pressure, directional flows, liquid/gas state, etc.
However, when you are considering things like eddies, you are looking at rarer, more heterogeneously structured, less persistent things, so you have much less evidence about them, which would make an information-theoretic approach very slow to allocate latents for them. It might still do so out of a hallucinatory accident, like “hey, this happens to look like <some totally unrelated phenomenon, e.g. a pencil-drawn circle>!”, but such hallucinatory accidents don’t seem important for the question raised in the OP.
So the problem is: if they are rare, internally heterogeneous, and non-persistent, in what sense are they a “real category”? I’m not super familiar with fluid dynamics in particular, so I might be wrong here, but I’ve thought deeply about other high-skewness concepts, where I eventually concluded that magnitude is what matters.
I see.
For an informational approach, we don’t need that much internal redundancy or persistence. Even an eddy is plenty internally redundant and persistent: I can take a video of an eddy flow, cover up a random quarter of it, and have a pretty decent guess at what that covered chunk looks like by looking at its surroundings. Very roughly speaking, that’s all the redundancy we need. Similarly for persistence: I can take a video of an eddy, cut a random contiguous quarter of the frames, and have a pretty decent guess at what the missing frames look like from the frames before and after. Very roughly speaking, that’s all the persistence we need.
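To make the "cover up a quarter and guess from the surroundings" test concrete, here is a rough toy sketch. Everything in it is an assumption for illustration: the "eddy" is just a smooth radially symmetric bump standing in for a stream function (not real fluid data), the guess is made by simple neighbour-averaging (harmonic fill), and the names `toy_eddy`, `random_quarter_mask`, and `fill_from_surroundings` are made up here. The comparison is against a guess that ignores the surroundings' structure and just uses the average visible value.

```python
import numpy as np

def toy_eddy(n=32, sigma=0.5):
    """Smooth radial bump on [-1, 1]^2; a toy stand-in for an eddy (assumption)."""
    xs = np.linspace(-1.0, 1.0, n)
    x, y = np.meshgrid(xs, xs)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2))

def random_quarter_mask(n, rng):
    """Boolean mask hiding a random (n/2 x n/2) block that never touches the border."""
    half = n // 2
    r0 = rng.integers(1, n - half)  # upper bound is exclusive, so the block stays interior
    c0 = rng.integers(1, n - half)
    mask = np.zeros((n, n), dtype=bool)
    mask[r0:r0 + half, c0:c0 + half] = True
    return mask

def fill_from_surroundings(field, mask, iters=2000):
    """Guess hidden cells by repeatedly averaging their four neighbours."""
    guess = field.copy()
    guess[mask] = field[~mask].mean()  # start from the average visible value
    for _ in range(iters):
        neighbours = (
            np.roll(guess, 1, axis=0) + np.roll(guess, -1, axis=0)
            + np.roll(guess, 1, axis=1) + np.roll(guess, -1, axis=1)
        ) / 4.0
        guess[mask] = neighbours[mask]  # visible cells are never modified
    return guess

def rmse(a, b, mask):
    """Root-mean-square error restricted to the hidden cells."""
    return float(np.sqrt(np.mean((a[mask] - b[mask]) ** 2)))

rng = np.random.default_rng(0)
field = toy_eddy()
mask = random_quarter_mask(field.shape[0], rng)

filled = fill_from_surroundings(field, mask)
baseline = np.full_like(field, field[~mask].mean())

err_fill = rmse(filled, field, mask)
err_base = rmse(baseline, field, mask)
print(f"guess from surroundings: {err_fill:.3f}, constant guess: {err_base:.3f}")
```

If the surroundings carry information about the covered chunk, the first error should come out smaller than the second. The same setup could be run along the time axis (drop a contiguous quarter of the frames and interpolate) for the persistence version, though this sketch only does the spatial case.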
I think that works for satisfying the natural latent criterion, but interoperable semantics has a skewness/metric requirement you haven’t addressed yet. Like, if you show someone a video of an eddy and tell them this is what’s called an “eddy”, what lets them infer that “eddy” refers to the eddy (and could refer to other eddies in other circumstances), rather than to e.g. the fluid or the video or the background or the center or whatever? You need some metric of importance, and the only such metric information theory has is evidence, but evidence is, as far as I can tell, not the right metric for this in general.