I haven’t read every natural abstraction post yet, but I’m wondering whether this is a useful frame:
The relevant inductive bias for algorithms that learn natural abstractions might be to jointly minimize expected working-memory use and model complexity. This means the model will create labels for the concepts that appear more frequently in the data distribution, with the optimal label length shorter for more commonly useful concepts.
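As a minimal illustration of the label-length claim (the concept names and frequencies below are invented), an optimal Shannon/Kraft-style prefix code assigns a concept used with probability p a label of roughly log2(1/p) bits, so the commonly useful concepts get the short labels:

```python
import math

# Hypothetical usage frequencies for three concepts. Under an optimal
# prefix code, the label for a concept used with probability p costs
# about log2(1/p) bits.
concept_freq = {"wheel": 0.30, "gear_ratio": 0.15, "tooth_17_angle": 0.001}

for concept, p in concept_freq.items():
    print(f"{concept}: ~{math.log2(1 / p):.1f} bits")
# wheel: ~1.7 bits, gear_ratio: ~2.7 bits, tooth_17_angle: ~10.0 bits
```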
In a prior over computable hypotheses, the hypotheses should be ordered by K-complexity(h) + AverageMemoryUsageOverRuntime(h), with lower scores getting higher prior probability.
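Here is a rough sketch of what scoring hypotheses by such a prior could look like, assuming hypotheses are Python functions. All names (prior_score, memory_cost, the two toy models) are mine; source length stands in for K-complexity, and tracemalloc's peak allocation, averaged over sampled inputs, stands in for memory use over the runtime. A real version would also need to put the two terms on a common scale.

```python
import inspect
import tracemalloc

def memory_cost(h, x):
    """Peak bytes allocated while evaluating hypothesis h on input x
    (a crude stand-in for working-memory use over the runtime)."""
    tracemalloc.start()
    h(x)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

def prior_score(h, inputs):
    """Lower score = higher prior. Source length stands in for K(h)."""
    k_complexity = len(inspect.getsource(h))
    avg_memory = sum(memory_cost(h, x) for x in inputs) / len(inputs)
    return k_complexity + avg_memory

def summary_model(xs):
    return sum(xs) / len(xs)        # keeps O(1) running state

def detailed_model(xs):
    history = list(xs)              # keeps the whole input in memory
    return sum(history) / len(history)

inputs = [list(range(1000)) for _ in range(10)]
# Both models make the same predictions, but detailed_model allocates
# more, so prior_score(summary_model, inputs) comes out lower (better).
```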
I think this gives us the properties we want:
The hypothesis doesn’t compute details when they are irrelevant to its predictions.
The most memory-efficient way to simulate the output of a gearbox uses some representation equivalent to the natural summary statistics. But if the system has to predict the atomic detail of a gear, it falls back to the low-level simulation, as in the sketch below.
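A toy version of the gearbox point (all class and method names here are illustrative): both models predict the output shaft speed, but the abstract one carries only the natural summary statistic, while the detailed one pays for per-tooth state that is only needed for per-tooth questions.

```python
class AbstractGearbox:
    """O(1) state: just the gear ratio, the natural summary statistic."""
    def __init__(self, teeth_in, teeth_out):
        self.ratio = teeth_in / teeth_out

    def output_speed(self, input_speed):
        # The natural concept is a trivial readout of the compact state.
        return input_speed * self.ratio

class DetailedGearbox:
    """O(n) state: the angle of every tooth. Only worth the memory when
    the prediction target is a specific tooth, not the aggregate output."""
    def __init__(self, teeth_in, teeth_out):
        self.ratio = teeth_in / teeth_out
        self.tooth_angles = [i * 360 / teeth_out for i in range(teeth_out)]

    def tooth_angle_after(self, tooth, input_speed, t):
        # Low-level simulation: track an individual tooth through time.
        return (self.tooth_angles[tooth] + input_speed * self.ratio * t) % 360
```

Note that AbstractGearbox.output_speed is also an instance of the next property: the natural concept is a simple function of the model’s state.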
There exists a simple function from model state to any natural concept.
Common abstract concepts have a short description length and need to be used by the (low-K-complexity) hypothesis program.
Most real-world models approximate this prior by having some kind of memory bottleneck (e.g., a fixed-size hidden state). The more closely an algorithm approximates this prior, the more “natural” the set of concepts it learns.
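One everyday instance of such a bottleneck, sketched rather than trained (the weights below are random; a real model would learn them): an autoencoder whose hidden state is much smaller than its input, forcing it to spend its limited working memory on the most commonly useful summary statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = 0.05 * rng.normal(size=(8, 256))   # 256-dim input -> 8-dim bottleneck
W_dec = 0.05 * rng.normal(size=(256, 8))

def encode(x):
    # The 8-dim state is the "working memory": every concept the model
    # wants to carry forward must fit through this bottleneck.
    return np.tanh(W_enc @ x)

def decode(z):
    # If the bottleneck learns natural abstractions, natural concepts
    # should be simple (here, linear) functions of the state z.
    return W_dec @ z
```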
On my current understanding, this frame is true but more general; the natural abstraction hypothesis makes narrower predictions than it does.