Great question that I wish I had an answer to! I haven’t yet played around with GANs so not entirely sure. Do you have any intuition about what one would expect to see?
Well, SAEs are the hot new thing I don’t know much about, so I was hoping you’d know how they compare to the dense z latents of GANs. (This is not as historical or idle a question as it may seem, because GANs are enjoying a bit of a revival as diffusion people admit that actually, having true latent spaces and being able to generate images in a single forward pass are both kinda useful and maybe I had a point after all.)
GAN z are so useful because they are just a multivariate normal (or, in fact, any distribution you want to sample from: Bernoulli, exponential, Poisson, and those can even work better, according to the BigGAN paper, probably because they map onto features which are inherently binary or otherwise non-normally distributed, so you avoid the pathological parts of the z where the model is desperately trying to generate a face with half of a pair of glasses). You can invert an image to a pixel-identical reconstruction, interpret each variable of z meaningfully, systematically sample ‘around’ points or along trajectories or just avoid too much overlap, edit them with sliders, and so on. Diffusion models and SAEs seem to lack most of that, and the equivalents are ham-handed, imprecise, and expensive compared to a free z tweak and a single forward pass.
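To make the ‘free z tweak and a single forward pass’ concrete, here is a minimal PyTorch sketch of the sort of latent manipulation I mean; `G` stands in for some pretrained generator mapping z → image, and the latent width and the coordinate index are purely illustrative, not taken from any particular model:

```python
import torch

# Toy illustration of 'free z tweaks'. `G` is assumed to be a pretrained
# generator; the dimension and coordinate index below are made up.
z_dim = 512
z = torch.randn(1, z_dim)                          # sample from the N(0, I) prior
# z = torch.distributions.Bernoulli(0.5).sample((1, z_dim))  # or a non-normal prior, BigGAN-style
# img = G(z)                                       # one forward pass = one image

# 'Slider' edit: nudge a single (hopefully disentangled) coordinate.
z_edit = z.clone()
z_edit[0, 42] += 2.0                               # push some feature direction
# img_edited = G(z_edit)

# A trajectory 'around' a point: linear interpolation between two seeds.
z2 = torch.randn(1, z_dim)
path = [torch.lerp(z, z2, float(t)) for t in torch.linspace(0, 1, 8)]
# frames = [G(p) for p in path]
```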
GAN z don’t seem to work too well with really skewed distributions of features, particularly rare binary features. You usually make a dense z of 64–512 variables, so while the GAN can represent rare binary features, it can’t do so cleanly as a single variable (not even a binomial set to a very low p) without ‘using up’ the embedding; they have to be represented as complex nonlinear interactions of potentially multiple variables. Maybe that’s not a big deal when you’re using another nonlinear model like random forests to figure out how to control the z, but it hampers interpretability & control. And if you make z bigger and bigger, it’s unclear how well the GAN will keep the latent space useful; the usual way to plug the random seed in is through a dense fully-connected layer, so that’s not going to scale too well.
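As a back-of-the-envelope for that scaling worry: assuming the standard design where z enters through a single fully-connected layer projecting to a small (say 4×4) starting feature map, that layer’s parameter count grows linearly in the width of z, so a z wide enough to give every rare feature its own clean variable gets unwieldy fast. The channel and spatial numbers here are illustrative, not from any particular architecture:

```python
# Parameter count of the usual z -> (4x4 x channels) fully-connected entry
# layer, as a function of z's width; channels/spatial size are illustrative.
def fc_params(z_dim: int, ch: int = 512, spatial: int = 4) -> int:
    out = spatial * spatial * ch
    return z_dim * out + out          # weights + biases

for z_dim in (128, 512, 4096, 65536):
    print(f"z_dim={z_dim}: {fc_params(z_dim) / 1e6:.1f}M params in the input FC alone")
# prints roughly 1.1M, 4.2M, 33.6M, and 536.9M respectively
```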
(Also, while GANs are enjoying a revival, the problem of ‘sequence GANs’ or ‘language GANs’ admittedly remains unsolved: we do not have, and I am aware of no meaningful prospects for, an ‘LLM GAN’ which is anywhere near SOTA.)
But I think there’s some potential here for crossover, especially as in some ways they seem to be opposites of each other. SAEs seem to be very expensive to train: could they be bootstrapped from a pre-existing GAN, which presumably already captures the desired features, often represented linearly and disentangled, to speed up training a lot? Or could one encode a large dataset into z and then train an SAE on those embeddings instead of on internal activations? Can GANs expand their z during training, say by progressively adding new entries to z as binomials with ever-lower p (inspired by nonparametric processes), to capture ever-rarer features in a clean way? Can SAE training techniques make a large z feasible? Since you can change the z arbitrarily to any random vector you want, or even swap in/out adapters for the Generator to draw from totally different sources (it doesn’t even affect the Discriminator), can we feed SAE features directly into GAN Gs? And so on.
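As a hedged sketch of the ‘SAE the z embeddings’ idea: assuming you have already inverted a dataset into latent vectors (the `latents` tensor below is just a random placeholder for those), a vanilla ReLU + L1 sparse autoencoder over them might look something like this; the widths, sparsity coefficient, and training loop are all placeholders rather than anyone’s published recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder with an L1 penalty on feature activations."""
    def __init__(self, d_in: int, d_hidden: int):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        feats = F.relu(self.enc(x))      # sparse feature activations
        return self.dec(feats), feats

d_z, expansion, l1_coeff = 512, 16, 1e-3
sae = SparseAutoencoder(d_z, d_z * expansion)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)

latents = torch.randn(10_000, d_z)       # placeholder for real inverted z/w vectors
for step in range(1_000):
    batch = latents[torch.randint(len(latents), (256,))]
    recon, feats = sae(batch)
    loss = F.mse_loss(recon, batch) + l1_coeff * feats.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()
```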