Erik Jenner comments on Internal Interfaces Are a High-Priority Interpretability Target

Erik Jenner 29 Dec 2022 22:26 UTC
8 points
I’m very interested in examples of non-modular systems, but I’m not convinced by this one, for multiple reasons:
- Even a 1,500-line function is a pretty small part of the entire codebase. So the very existence of that function as a separate unit suggests that the codebase as a whole is at least somewhat modular.
- My guess is that the function itself is in fact also modular (in the way I’d use the term). I only glanced at the function you link very quickly, but one thing that jumped out was the comments dividing it into “Phase 1” to “Phase 5”. So even though it’s not explicitly decomposed into submodules in the sense of e.g. helper functions, it does seem that programmers find it a useful abstraction to think of this huge function as a composition of five “submodules” that perform different subtasks. I would guess that this abstraction is reflected somehow in the structure of the function itself, as opposed to being completely arbitrary (i.e. putting the boundaries between phases at random line numbers instead would be worse in some pretty objective sense). So to me, the existence of 1,500-line functions is not strong evidence against the ubiquity of modularity, since modularity properly defined should be more general than just functions. I do agree this would be a good counterexample to certain definitions of modularity that are too narrow. (To be clear, I don’t think anyone has a good definition yet for how this “natural submodule structure” could be read off from a program.)
- Regarding the if statements: arguably, “truly non-modular” code would have lots of if statements whose conditions use a big fraction or even almost all of the variables in scope (or if we’re being strict, in the entire program, since smaller scopes already imply submodules). So if an if statement in a function with hundreds of variables contains just 4-8 terms (depending on how we count), I don’t think that’s a lot.
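To make the last point a bit more concrete, here is a rough sketch of the kind of count I have in mind: for each if statement, what fraction of the function's variables does its condition mention? This uses Python's `ast` module on a toy function, not the function linked above. The helper name `if_condition_fractions` and the example source are made up for illustration, and "every name that appears in the function" is only a crude stand-in for "variables in scope".

```python
import ast


def if_condition_fractions(source: str) -> list[float]:
    """For each `if` in the first function found in `source`, return the share
    of that function's distinct variable names mentioned in the condition."""
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))

    # Crude proxy for "variables in scope": every name read or written in the
    # function body, plus its positional arguments.
    scope = {n.id for n in ast.walk(func) if isinstance(n, ast.Name)}
    scope |= {a.arg for a in func.args.args}

    fractions = []
    for node in ast.walk(func):
        if isinstance(node, ast.If):
            used = {n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)}
            fractions.append(len(used) / len(scope))
    return fractions


# Hypothetical toy input: six variables (a, b, c, d, e, g), one if statement
# whose condition mentions two of them (a and e).
EXAMPLE = '''
def f(a, b, c, d):
    e = a + b
    g = c * d
    if a > 0 and e < 10:
        return g
    return e
'''

if __name__ == "__main__":
    print(if_condition_fractions(EXAMPLE))
```

On this measure, a condition with 4-8 terms inside a function with hundreds of variables would come out as a few percent at most, which is the sense in which I'd call it "not a lot".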
I wouldn’t be surprised to learn we just have somewhat different notions of what “modularity” should mean. For sufficiently narrow definitions, I agree that lots of computations are non-modular; I’m just less interested in those definitions.