There isn’t a lot of talk on LW about image models (e.g. DALL-E and Stable Diffusion) in the context of alignment, especially compared to LLMs. Why is that? Some hypotheses:
LLMs just happened to get some traction early, and due to network effects, they are the primary research vehicle
LLMs pose a larger alignment risk than image models, e.g. the only alignment risk of image generation comes from the language embedding
LLMs are not a larger alignment risk, but they are easier to use for alignment research
Following Scott Aaronson, we might say the answer depends on whether we’re talking about a reform or an orthodox vision of alignment. Adversarial images and racial bias are definitely real concerns for automatic vision, and hence for reform alignment. But many animal species have mastered vision, movement, or olfaction better than humans as a species, for hundreds of millions of years, without producing anything that could challenge the competitive advantage of human language, so I’d guess that from an orthodox alignment perspective, vision looks much less scary than language models.
I’m curious whether those comfortable with either the orthodox or the reform label would corroborate these predictions about their views.