ML models, like all software, and as the NAH would predict, must consist of several specialized “modules”.
After reading the source code of MySQL InnoDB for 5 years, I doubt it. I think it is perfectly possible—and actually what I would expect to happen by default—to have a huge piece of working software with no clear module boundaries.
Take a look at this case in point: the row_search_mvcc() function https://github.com/mysql/mysql-server/blob/8.0/storage/innobase/row/row0sel.cc#L4377-L6019, which has 1,500+ lines of code and references hundreds of variables. This function is called in the inner loop of almost every SELECT query you run, so on the one hand it probably works quite correctly; on the other hand, it was subject to “optimization pressure” over 20 years, and this is what you get. I think this is because Evolution is not Intelligent Design: it simply uses the cheapest locally available hack to get things done, and that is usually to reuse the existing variable #135—or, more realistically, a combination of variables #135 and #167—to do the trick. See how many of the if statements have conditions which use more than a single atom.
(Speculation: I suspect that unless you chisel your neural network architecture to explicitly disallow connecting the neuron in question directly to neurons #145 and #167, it will connect to them as soon as it discovers they provide useful bits of information. I suspect this is why figuring out what layers you need, and the connectivity between them, is difficult. Also, I suspect this is why ensuring the right high-level wiring between parts of the brain, and how to wire them to input/output channels, might be the most important part to encode in DNA, as the inner connections and weights can be figured out relatively easily later.)
I’m very interested in examples of non-modular systems, but I’m not convinced by this one, for multiple reasons:
Even a 1,500-line function is a pretty small part of the entire codebase. So the existence of that function already means that the codebase as a whole seems somewhat modular.
My guess is that the function itself is in fact also modular (in the way I’d use the term). I only glanced at the function you link very quickly, but one thing that jumped out was the comments that divide it into “Phase 1” to “Phase 5”. So even though it’s not explicitly decomposed into submodules in the sense of e.g. helper functions, it does seem that programmers find it a useful abstraction to think of this huge function as a composition of five “submodules” that perform different subtasks. I would guess that this abstraction is reflected somehow in the structure of the function itself, as opposed to being completely arbitrary (i.e. putting the boundaries between phases at random line numbers instead would be worse in some pretty objective sense). So to me, the existence of 1,500-line functions is not strong evidence against the ubiquity of modularity, since modularity properly defined should be more general than just functions. I do agree this would be a good counterexample to certain definitions of modularity that are too narrow. (To be clear, I don’t think anyone has a good definition yet for how this “natural submodule structure” could be read off from a program.)
Regarding the if statements: arguably, “truly non-modular” code would have lots of if statements that use a big fraction, or even almost all, of the variables in scope (or, if we’re being strict, in the entire program, since smaller scopes already imply submodules). So if an if statement in a function with hundreds of variables contains just 4–8 terms, depending on how we count, that’s not a lot.
I wouldn’t be surprised to learn we just have somewhat different notions of what “modularity” should mean. For sufficiently narrow definitions, I agree that lots of computations are non-modular, I’m just less interested in those.