lewis smith comments on Circuits in Superposition: Compressing many small neural networks into one

lewis smith 15 Oct 2024 13:51 UTC
1 point
“with later small networks taking the outputs of earlier small networks as their inputs.”
What’s the distinction between two small networks connected in series, with the second taking the output of the first as its input, and one big network? What defines the boundaries of the networks here?
- jake_mendel 15 Oct 2024 14:05 UTC
  3 points
  I’m not sure I understand your question, but are you asking ‘in what sense are there two networks in series rather than just one deeper network’? The answer to that would be: parts of the inputs to a later small network could come from the outputs of many earlier small networks. Provided the later subnetwork is still sparsely used, the distribution of when it is used could differ from that of any particular earlier subnetwork. A classic simple example is how the left-orientation dog detector and the right-orientation dog detector in InceptionV1 fire sort of independently, but both their outputs are inputs to the any-orientation dog detector (which in this case is just computing an OR).
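A rough toy sketch of the OR example, not taken from the post or the comment: two sparse binary "detectors" fire independently, and a downstream unit computes their OR using two ReLUs. The names `or_detector` and `simulate`, the firing probabilities, and the ReLU construction are all assumptions chosen for illustration, not how InceptionV1 actually implements it.

```python
import random


def relu(x):
    return max(0.0, x)


def or_detector(left, right):
    # Hypothetical downstream unit: OR of two binary inputs built from
    # two ReLUs. For inputs in {0, 1}: relu(s) - relu(s - 1) = min(s, 1).
    s = left + right
    return relu(s) - relu(s - 1.0)


def simulate(n_samples=10_000, p_left=0.05, p_right=0.05, seed=0):
    # p_left and p_right are assumed firing rates for the two upstream
    # detectors; each fires independently of the other.
    rng = random.Random(seed)
    counts = {"left": 0, "right": 0, "both": 0, "any": 0}
    for _ in range(n_samples):
        left = 1.0 if rng.random() < p_left else 0.0
        right = 1.0 if rng.random() < p_right else 0.0
        counts["left"] += int(left)
        counts["right"] += int(right)
        counts["both"] += int(left * right)
        counts["any"] += int(or_detector(left, right))
    return counts


if __name__ == "__main__":
    c = simulate()
    # The downstream unit is still sparse, but it fires on a different set
    # of inputs from either upstream detector taken alone.
    print(c)
```

The point of the sketch is only that the any-orientation unit has its own usage pattern: it is active whenever either upstream detector is, so it cannot be folded into a single "deeper" copy of just one of them.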
  - lewis smith 15 Oct 2024 15:01 UTC
    1 point
    Yeah, that makes sense, I think.