idk if this is The Reason or anything, but one factor might be that current image models use a heavily convolutional architecture and are, as a result, quite a bit weaker in raw capability. Transformers are involved, but not as heavily as in current language models.
You’re saying that transformers are key to alignment research?
I would imagine that latent space exploration and explanation is a useful part of interpretability, and that developing techniques that work for both language and images improves the chance that they will generalize to new neural architectures. A rough sketch of what that architecture-agnosticism can look like in practice is below.
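
To make the "generalizes across architectures" point concrete, here's a minimal sketch, assuming PyTorch. The `capture_activations` helper and the toy models are illustrative, not taken from any particular interpretability library; the idea is just that forward hooks grab intermediate activations the same way whether the submodule is a conv block or a transformer layer.

```python
import torch
import torch.nn as nn

def capture_activations(model: nn.Module, layer_names):
    """Attach forward hooks that record the outputs of the named submodules."""
    activations = {}

    def make_hook(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    handles = [
        module.register_forward_hook(make_hook(name))
        for name, module in model.named_modules()
        if name in layer_names
    ]
    return activations, handles

# The same helper works on a small convolutional image model...
cnn = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1),
)
cnn_acts, cnn_handles = capture_activations(cnn, {"0", "2"})
cnn(torch.randn(1, 3, 32, 32))

# ...and on a transformer encoder, with no change to the tooling.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
tr_acts, tr_handles = capture_activations(encoder, {"layers.0", "layers.1"})
encoder(torch.randn(1, 10, 64))

for h in cnn_handles + tr_handles:
    h.remove()

print({name: act.shape for name, act in cnn_acts.items()})
print({name: act.shape for name, act in tr_acts.items()})
```

The downstream analysis (probing, clustering, feature visualization, whatever) obviously still has to account for the different shapes of the latents, but the collection layer itself doesn't care which architecture produced them, which is the sense in which the tooling has a chance of carrying over to whatever comes next.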