Question/Remark 2: AFAICT, your theory has a major missing piece, namely a proof that “abstraction” (formalized your way) is actually a crucial ingredient of learning/cognition. The way I see it, such a proof should work by demonstrating that hypothesis classes defined in terms of probabilistic graphical models / abstraction hierarchies can be learned with good sample complexity (better yet if you can also say something about computational complexity), in a manner that cannot be achieved if you discard any of the pieces you consider important. You might have some different approach to this, but I’m not sure what it is.
Doesn’t the necessity of abstraction follow from size concerns? The alternative to abstraction would be to measure and simulate everything in full detail, which can only be done if you are “exponentially bigger than the universe” (and have exponentially many universes to learn from).
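To connect this to the sample-complexity framing above, here is a standard back-of-envelope sketch (illustrative only; the variable count $n$ and in-degree bound $d$ are assumptions, not part of the formalism). Learning an arbitrary distribution over $n$ binary variables to total-variation distance $\epsilon$ requires on the order of $2^n$ samples, whereas a factored model with bounded in-degree has only polynomially many parameters:

$$\text{unrestricted over } \{0,1\}^n:\quad 2^n - 1 \text{ parameters},\qquad m = \Theta\!\left(2^n/\epsilon^2\right) \text{ samples}$$

$$\text{Bayes net, in-degree} \le d:\quad \le n \cdot 2^d \text{ parameters},\qquad m = \tilde{O}\!\left(n \cdot 2^d/\epsilon^2\right) \text{ samples (roughly, up to log factors)}$$

That exponential gap is what the “exponentially bigger than the universe” remark points at: a learner that refuses to factor the world needs resources exponential in the world's size.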
One could argue that some kind of abstraction is necessary due to size concerns, but that alone does not necessarily nail down my whole formalism.