I’d argue that the relative ease of physics comes from the fact that we can get effectively unlimited data, combined with the ability to query our reality as an oracle to test ideas and, importantly, to verify a theory cheaply. This helps in two ways:
The prior matters very little, because you can update to the right theory from all but the most dogmatic priors.
Easy verification short-circuits a lot of philosophical debates, and makes it easy to update towards correct theories.
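The first point can be made concrete with a toy Bayesian sketch (my illustration, not the commenter's): in a Beta-Bernoulli model, wildly different priors converge to essentially the same posterior once data is abundant; only a dogmatic (point-mass) prior fails to update.

```python
# Beta(a, b) prior on a coin's bias, updated on n flips with k heads,
# gives a Beta(a + k, b + n - k) posterior. Its mean,
# (a + k) / (a + b + n), tends to k / n regardless of (a, b).

def posterior_mean(a, b, k, n):
    return (a + k) / (a + b + n)

k, n = 7_000, 10_000                      # lots of data, 70% heads
optimist = posterior_mean(50, 1, k, n)    # prior mean ~0.98
skeptic = posterior_mean(1, 50, k, n)     # prior mean ~0.02
print(optimist, skeptic)                  # both close to 0.7
```

With only a handful of flips the two would still disagree noticeably; it is the effectively unlimited data that makes the choice of prior nearly irrelevant.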
However, I think the main insight, that sloppy but directionally correct ideas are useful to build upon and that partial progress matters, has applicability well beyond physics.
This makes sense, but I’d argue that ML and interpretability have even more of both of these properties. Something that makes them harder is that some of the high-level goals of understanding transformers are inherently pretty complex, and the field is less susceptible to math/elegance-based analysis, so it is even messier :)
I think the relative ease of progress in physics has more to do with its relative compositionality, in contrast to other disciplines like biology, economics, or the theory of differential equations, in the sense Jules Hedges meant it. To quote that essay:
For examples of non-compositional systems, we look to nature. Generally speaking, the reductionist methodology of science has difficulty with biology, where an understanding of one scale often does not translate to an understanding on a larger scale. … For example, the behaviour of neurons is well-understood, but groups of neurons are not. Similarly in genetics, individual genes can interact in complex ways that block understanding of genomes at a larger scale.
Such behaviour is not confined to biology, though. It is also present in economics: two well-understood markets can interact in complex and unexpected ways. Consider a simple but already important example from game theory. The behaviour of an individual player is fully understood: they choose in a way that maximises their utility. Put two such players together, however, and there are already problems with equilibrium selection, where the actual physical behaviour of the system is very hard to predict.
More generally, I claim that the opposite of compositionality is emergent effects. The common definition of emergence is a system being ‘more than the sum of its parts’, and so it is easy to see that such a system cannot be understood only in terms of its parts, i.e. it is not compositional. Moreover I claim that non-compositionality is a barrier to scientific understanding, because it breaks the reductionist methodology of always dividing a system into smaller components and translating explanations into lower levels.
More specifically, I claim that compositionality is strictly necessary for working at scale. In a non-compositional setting, a technique for solving a problem may be of no use whatsoever for solving the problem one order of magnitude larger. To demonstrate that this worst case scenario can actually happen, consider the theory of differential equations: a technique that is known to be effective for some class of equations will usually be of no use for equations removed from that class by even a small modification. In some sense, differential equations is the ultimate non-compositional theory.
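The differential-equations point can be sketched numerically (my illustration, not the essay's): superposition is a compositional technique for linear ODEs, yet a small nonlinear modification of the equation makes it useless.

```python
import numpy as np

# y'' + y = 0 is solved by sin and cos, hence (by superposition)
# also by their sum. Check the residual of each equation for that sum.
t = np.linspace(0.1, 10.0, 500)
y = np.sin(t) + np.cos(t)        # candidate solution by superposition
ypp = -np.sin(t) - np.cos(t)     # its exact second derivative

linear_residual = ypp + y                # y'' + y: identically zero
nonlinear_residual = ypp + y + y**3      # y'' + y + y^3: far from zero

print(np.abs(linear_residual).max())     # ~0.0: superposition works
print(np.abs(nonlinear_residual).max())  # order 1 or larger: it fails
```

Adding the tiny term y³ leaves the equation looking almost unchanged, but the solution technique that worked for the linear class contributes nothing to the modified problem.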