I agree, but the point I’m making is that you had to know the labels in order to know where to detach the gradient. So it’s kind of like making something interpretable by imposing your interpretation on it, which I feel is tautological
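(For concreteness, by "detach the gradient based on labels" I mean something like the following PyTorch-style sketch, where one latent dimension is reserved per class. The names are mine, not the post's:)

```python
import torch
import torch.nn.functional as F

def route_gradients(z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Pass z through unchanged on the forward pass, but only let
    gradients flow through each example's labeled latent dimension.

    z:      (batch, num_classes) latent codes
    labels: (batch,) integer class labels
    """
    # One-hot mask picking out the latent dimension matching each label.
    mask = F.one_hot(labels, num_classes=z.shape[1]).to(z.dtype)
    # Forward value is identical to z; backward is blocked where mask == 0.
    return mask * z + (1.0 - mask) * z.detach()
```

The mask is built directly from the label, which is exactly why the resulting "interpretable" structure feels imposed rather than discovered.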
For the record I’m excited by gradient routing, and I don’t want to come across as a downer, but this application doesn’t compel me
Edit: Here’s an intuition pump. Would you be similarly excited by training 10 different autoencoders, each reconstructing a single digit, and then stitching them together into a single global autoencoder? Because conceptually that seems like what you’re doing.
I disagree that this is the same as just stitching together different autoencoders. Presumably the encoder does some shared computation before specializing at the encoding level. I also don’t see how you could use 10 different autoencoders to classify an image from the encodings. I guess you could compare reconstruction losses and take the label of whichever autoencoder scores lowest as the prediction, but that seems different from what I’m doing.
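(To spell out that baseline, it would be something like the sketch below, where `autoencoders` is a hypothetical list of 10 per-digit models, not anything from my post:)

```python
import torch

@torch.no_grad()
def classify_by_reconstruction(x: torch.Tensor, autoencoders) -> torch.Tensor:
    """Predict digits for a batch x by picking, per example, the
    autoencoder with the lowest reconstruction error."""
    losses = []
    for ae in autoencoders:
        recon = ae(x)
        # Per-example mean squared reconstruction error.
        losses.append(((recon - x) ** 2).flatten(start_dim=1).mean(dim=1))
    # Stack to shape (batch, 10); the argmin index is the predicted label.
    return torch.stack(losses, dim=1).argmin(dim=1)
```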
However, I agree that this application is not useful. I shared it because I (and others) thought it was cool. It’s not really practical at all.
Hope this addresses your question :)
I see. If you disagree with that characterization, then I’ve likely misunderstood what you were doing in this post, in which case I no longer endorse the above statements. Thanks for clarifying!