After reading this, I went back and also re-read Gears in Understanding (https://www.lesswrong.com/posts/B7P97C27rvHPz3s9B/gears-in-understanding), which this is clearly working from. The key question to me was: is this a better explanation for some class of people? If so, it’s quite valuable, since gears are a vital concept. If not, then it has to introduce something new, which I don’t see here, or it’s not worth including.
It’s not easy to put myself in the mind of someone who doesn’t know about gears.
I think the original Gears in Understanding gives a better understanding of the central points, if you grok both posts fully, and gives better ways to get a sense of a given model’s gear-ness level. What this post does better is Be Simpler, which can be important, and provide a simpler motivation for What Happens Without Gears. In particular, this simplified version seems easier to use to get someone up to speed, to the point where they can usefully go ‘wait a minute, that doesn’t have any gears.’
My other worry this brought up is that it reflects a general trend of moving towards things that stand better alone, are simpler to grok, and are easier to appreciate, at the cost of richness of detail and grounding in related concepts. Years ago we’d do more of the thing Gears in Understanding did; now we do the Gears vs. Behavior thing more. Gears are important enough that I don’t mind doing both (even if only to have a backup), but there’s a slippery slope where the second thing drives out the first, and you’re left pretty sad after a while.
Huh. This was very different from the role I originally imagined for this post.
My reading of Gears in Understanding was that it was trying to gesture at a concept, and give that concept a handle, without really explaining or “defining” the concept. It gave some heuristics for recognizing gears-level models in the wild (e.g. “could the model be correct but a given variable be different?” or “if the model were falsified, what else could we infer from that?”), but it never really said why those heuristics should all correlate with each other, or why/how they’re all pointing to the same thing, or why that thing would be useful/interesting. It’s just trying to point to a particular cluster in concept-space, without trying to explain why there’s a cluster there, or precisely delineate the cluster.
Gears vs Behavior is trying to (at least partially) explain why there’s a cluster there and draw a box around it. It’s presenting a frame in which the key, defining property which separates gears-level models from purely behavioral/black box models is that gearsy models can make predictions and leverage data from side-channels and out-of-distribution (which is itself a pseudo-side-channel). Heuristics like “if the model were falsified, what else could we infer from that?” follow from the ability to leverage side-channels. I claim that this is pointing to the same thing Val was trying to point to, and can be usefully viewed as the main defining property of gearsy models (as opposed to blackboxes).
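To make that concrete, here’s a toy sketch of my own (not from either post; the scenario and all names are mine): a gears-level model of a falling ball versus a line fit to observations from a narrow regime. The gears model extrapolates out-of-distribution and answers side-channel queries (impact speed, different gravity) that the behavioral fit was never built to address.

```python
import numpy as np

g = 9.8  # m/s^2, near Earth's surface

def fall_time(h):
    """Gears-level model: t = sqrt(2h/g), derived from the mechanism."""
    return np.sqrt(2 * h / g)

# "Behavior" model: a line fit to data from a narrow training regime.
heights = np.linspace(1.0, 2.0, 20)   # training distribution: 1-2 meters
times = fall_time(heights)            # noise-free, to keep the toy simple
slope, intercept = np.polyfit(heights, times, 1)

h_far = 100.0  # far outside the training distribution
print(f"gears:    {fall_time(h_far):.2f} s")           # ~4.5 s (correct)
print(f"blackbox: {slope * h_far + intercept:.2f} s")  # ~19 s (way off)

# Side-channels: the gears model answers queries the fit never saw,
# e.g. impact speed, or fall time under a different gravity.
print(f"impact speed: {np.sqrt(2 * g * h_far):.1f} m/s")  # ~44 m/s
```

Both models predict fine on-distribution; the difference only shows up when you ask the side-channel or out-of-distribution questions, which is exactly the point.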
This post comes across as simpler because it’s focused on the key defining property of the cluster, rather than trying to gesture at the concept from a bunch of different directions without really figuring out what the “defining” feature is.
At this point, I actually think a better way to “define” gears-level models is the dimensionality and conditional independence framework, although I still see Gears vs Behavior as a basically-correct “dual” to those frames, focusing on the space of queries that models address, rather than the structures of the models themselves. (In some sense, dimensionality and conditional independence give a gears-level model of gears-level models, while Gears vs Behavior gives a more blackboxy definition of gears-level models.)
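A minimal picture of the conditional-independence part, again my own toy construction rather than anything from the posts: in a chain of gears A → B → C, knowing the intermediate gear B screens C off from A.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Chain of "gears": A drives B, B drives C.
a = rng.normal(size=n)
b = a + rng.normal(size=n)
c = b + rng.normal(size=n)

print(np.corrcoef(a, c)[0, 1])   # ~0.58: A and C are correlated...

# ...but conditioning on the intermediate gear B screens C off from A.
near_zero = np.abs(b) < 0.05     # crude conditioning: a thin slice of B
print(np.corrcoef(a[near_zero], c[near_zero])[0, 1])  # ~0
```

The conditional independence structure is the "gears": it tells you which variables mediate which, which is what lets the model answer novel queries at all.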