nearly all of the informational work done in a toddler’s mind to figure out which pattern the word “apple” refers to must be performed by priors and general observations of the world, not by examples of apples specifically.
Mildly related: Most image/sound/etc. lossy compression algorithms (and that is what an abstraction is, a form of lossy data compression) are based on the Discrete Cosine Transform. Do you think that the brain does something like the DCT when relating visible apples to the concept of apple?
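For readers who have not seen the DCT used this way, here is a minimal sketch of the "keep only the low frequencies" idea behind DCT-based lossy compression. It is only an illustration under simplifying assumptions: real codecs such as JPEG work on small blocks and quantize the coefficients rather than simply dropping them, and the names `dct`, `idct`, `compress`, and `rms_error` are made up for this example.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of floats."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs

def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal

def compress(signal, keep):
    """Lossy round trip: zero every coefficient except the `keep` lowest frequencies."""
    coeffs = dct(signal)
    truncated = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    return idct(truncated)

def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Toy signal: a slow oscillation plus a small fast one.
signal = [math.sin(2 * math.pi * t / 32) + 0.1 * math.sin(2 * math.pi * 9 * t / 32)
          for t in range(32)]
coarse = compress(signal, keep=2)   # very lossy, only the broadest shape survives
finer = compress(signal, keep=8)    # less lossy, more detail survives
```

Because the transform is orthonormal, keeping more coefficients can never increase the reconstruction error, so `rms_error(signal, finer)` is at most `rms_error(signal, coarse)`. In the analogy, the few retained coefficients play the role of the abstraction, and the discarded ones are the detail it throws away.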
The cortex uses traveling waves of activity that help it organize concepts in space and time. In other words, the locally traveling waves provide an inductive bias for treating features that occur close together in space and time as part of the same object or concept. As a result, cortical space ends up mapping out conceptual space, in addition to retinotopic, somatic, or auditory space.
This is kind of like DCT in the sense that oscillations are used as a scaffold for storing or reconstructing information. I think that Neural Radiance Fields (NeRF) use a similar idea: they generate images from a positional encoding of the input (3D coordinates plus viewing direction, rather than 2D pixel position), and the resemblance is strongest when that encoding is built from Fourier features. Of course, Transformers also use sinusoidal positional encodings of this kind, in their case to represent token order for natural language understanding.
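To make the two encodings mentioned above concrete, here is a rough sketch of each, written from the formulas in the NeRF and original Transformer papers as I understand them. The function names `fourier_features` and `sinusoidal_position` are hypothetical, and real implementations are vectorized and differ in details such as the number of frequency bands.

```python
import math

def fourier_features(p, num_bands):
    """NeRF-style encoding of one scalar coordinate p.

    Returns [sin(2^0 * pi * p), cos(2^0 * pi * p), ...,
             sin(2^(L-1) * pi * p), cos(2^(L-1) * pi * p)] with L = num_bands.
    In practice this is applied to each input coordinate separately.
    """
    features = []
    for j in range(num_bands):
        freq = (2 ** j) * math.pi
        features.append(math.sin(freq * p))
        features.append(math.cos(freq * p))
    return features

def sinusoidal_position(pos, d_model):
    """Transformer-style encoding of an integer token position.

    Even dimensions hold sin(pos / 10000^(i / d_model)) and the following
    odd dimension holds the matching cosine, where i is the even index.
    Assumes d_model is even.
    """
    encoding = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding

point_code = fourier_features(0.3, num_bands=4)     # 8 numbers for one coordinate
token_code = sinusoidal_position(5, d_model=6)      # 6 numbers for position 5
```

In both cases a single number is expanded into a bank of oscillations at several frequencies, so nearby inputs get similar codes at low frequencies and distinguishable codes at high frequencies. That is the sense in which the oscillations act as a scaffold.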
All that is to say that I agree with you. Something similar to DCT will probably be very useful for discovering natural abstractions. For one thing, I imagine that these sorts of approaches could help overcome texture bias in DNNs by incorporating more large-scale shape information.
Thanks! Your links led me down some interesting avenues.