(Warning: I might not be describing this well. And might be stupid.)
I feel like there’s an alternative to the perspective you’re taking, and I’m trying to understand why you don’t go down that path. Something like: you’re taking the perspective that there is one Bayes net that we want to understand. The alternative perspective is: there is a succession of “scenarios”, each of which is its own Bayes net, with deep regularities connecting them; for example, they all obey the laws of physics, there’s some relation between the “chairs” in one scenario and in future scenarios, etc.
In the “multiple scenarios” perspective, we can frame everything in terms of building good generative models: models that fill in missing data from the data we do have, or predict the future from the past. It seems like “resampling” is unnecessary in this perspective; we can evaluate generative models by how well they work in the next scenario. Or, as another example, we would learn a generative model of a gear involving rigid rotation plus small-amplitude vibration, and when we see something that seems to match that model, we would guess that the model is applicable here. Etc.
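To make the gear example concrete, here’s a toy sketch (purely illustrative; the parameter names and the tolerance check are my own made-up choices): the generative model says the gear’s angle is rigid rotation plus a small-amplitude vibration, and if observed angles stay close to that prediction, we guess the model applies here.

```python
import numpy as np

rng = np.random.default_rng(0)

def gear_model(t, omega=1.0, vib_amp=0.01, vib_freq=50.0, noise=0.0):
    """Toy generative model: gear angle = rigid rotation + small-amplitude vibration."""
    rigid = omega * t                           # rigid rotation at angular velocity omega
    vibration = vib_amp * np.sin(vib_freq * t)  # small, fast vibration on top
    return rigid + vibration + rng.normal(0.0, noise, size=np.shape(t))

def model_seems_applicable(observed, t, tol=0.05):
    """Crude check: if the observations stay close to the model's predictions,
    guess that the rigid-rotation-plus-vibration model is applicable here."""
    return np.max(np.abs(observed - gear_model(t))) < tol

t = np.linspace(0.0, 1.0, 200)
observed = gear_model(t, noise=0.005)        # "data" from a noisy version of the same process
print(model_seems_applicable(observed, t))   # True: the model matches, so we adopt it
```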
Then a claim about far-away information would look something like “the generative model is structured as a bunch of sub-items with coordinates, and these items have local interactions”. And then you can make a substantive claim: “This class of generative models is a very powerful model class, really good at making predictions given a fixed model complexity”. Is that claim true? In some senses yes, but we can also immediately see limitations, e.g. “it is nighttime” is a useful generative model ingredient that doesn’t really have coordinates, and likewise “illegal”, etc.
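To spell out what I mean by “sub-items with coordinates and local interactions”, here’s a minimal sketch (my own toy construction, with made-up names): items carry coordinates, and interaction terms only connect items within some radius. A global ingredient like “it is nighttime” has no natural coordinates, which is exactly the limitation I’m pointing at.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Item:
    name: str
    coords: tuple[float, float]  # each sub-item has spatial coordinates
    state: float

def local_interactions(items, radius=1.0):
    """The model class: only pairs of items within `radius` of each other interact."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a.coords, b.coords)) ** 0.5
    return [(a.name, b.name) for a, b in combinations(items, 2) if dist(a, b) <= radius]

items = [
    Item("gear_tooth_1", (0.0, 0.0), 0.3),
    Item("gear_tooth_2", (0.5, 0.0), 0.1),
    Item("chair_in_next_room", (100.0, 7.0), 0.0),
]
print(local_interactions(items))  # only the two nearby gear teeth interact
# Something like "it is nighttime" or "illegal" has no sensible `coords` entry,
# so it doesn't fit this model class.
```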
The “redundant information” framing still applies; if a comparatively simple generative model can make lots of correct predictions, then clearly there was redundant information in what was predicted.
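Stated loosely in minimum-description-length terms (my framing, not a precise theorem): if we can describe the data D by first writing down a simple model M and then the residuals, and that total is much shorter than writing D out directly, the gap is a lower bound on how much redundant information D contained.

```latex
% Loose MDL-style statement of the "redundant information" point (not a precise theorem):
% K(M) = bits to describe the model, L(D \mid M) = bits to describe the residuals.
\text{redundancy in } D \;\ge\; L(D) - \bigl( K(M) + L(D \mid M) \bigr) \;\gg\; 0
\quad \text{whenever a comparatively simple } M \text{ predicts most of } D \text{ correctly.}
```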
Anyway, my question is: am I correct that we can define abstractions as “ingredients in generative models”, and if so, why don’t you like that approach? (Or is it equivalent to your approach??)
You can think of everything I’m doing as occurring in a “God’s eye” model. I expect that an agent embedded in this God’s-eye model will only be able to usefully measure natural abstractions within the model. So, shifting to the agent’s perspective, we could ask: “holding these abstractions fixed, what possible models are compatible with them?”. And that is indeed a direction I plan to go. But first, I want to get the nicest math I possibly can for computing the abstractions within a model, because the cleaner that is, the cleaner I expect the computation of possible models from the abstractions to be.
… that was kinda rambly, but I guess the summary is “building good generative models is the inverse problem for the approach I’m currently focused on, and I expect that the cleaner the current problem is solved, the easier it will be to handle its inverse”.