I suspect that a lot of approaches would have produced results similar to those we got from how we ended up doing language modeling. I believe the main advantage of Transformers over LSTMs is just that LSTMs have an exponentially decaying ability to pay attention to prior tokens, while Transformers can attend directly to every token in the context, no matter how far back it is. I suspect it would have been possible to fix the exponential decay problem with LSTMs and get them to scale like Transformers, but Transformers came first, so nobody tried. And that's not to say that ML as a field is incompetent or anything; it's just, why would you try when you already have Transformers?
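Here's a toy sketch of what I mean by the decay (just NumPy, with made-up forget-gate values, not a claim about any particular trained model): information in an LSTM cell state has to survive a chain of forget gates that are each less than 1, so the influence of an early token shrinks geometrically with distance, whereas an attention layer assigns a direct weight to every position in a single hop.

```python
import numpy as np

np.random.seed(0)
T = 50  # sequence length

# LSTM-style recurrence: the cell state carries information forward only
# through a product of forget gates, each in (0, 1). The influence of the
# token at position 0 on the state at step t therefore shrinks
# geometrically with t (gate values here are random, purely illustrative).
forget_gates = np.random.uniform(0.7, 0.95, size=T)
influence_of_token0 = np.cumprod(forget_gates)
print("LSTM-style influence of token 0 after 50 steps:",
      influence_of_token0[-1])  # vanishingly small

# Attention-style access: every position gets a weight from a softmax over
# scores, with no chained multiplication in between, so a distant token can
# receive as much weight as a recent one if its score is high.
scores = np.random.randn(T)
attn_weights = np.exp(scores) / np.exp(scores).sum()
print("Attention weight on token 0:", attn_weights[0])  # not exponentially small
```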
Also, note that "best results" for powerful AI systems is going to include alignment: alignment is a pretty important component of best results for any actual practical application the big labs care about, as opposed to just "scores the highest on some benchmark."
I agree that Transformers vs. other architectures is a better example of the field "following the leader," because there are lots of other strong architectures (Perceiver, MLP-Mixer, etc.). In comparison, self-supervised transfer learning is just an objectively good idea you can apply to any architecture, and one the brain itself almost surely uses. The field would have converged on it regardless of the dominant architecture.
One hopeful sign is how little attention the ConvBERT language model has gotten. It mixes convolution operations into self-attention so that the attention heads can focus on global patterns, while local patterns are handled by convolution, which is better suited to them. ConvBERT is more compute-efficient than a standard Transformer, but it hasn't made much of a splash. It shows the field can ignore low-profile advances made by smaller labs.
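Roughly, the idea looks like the sketch below (a hand-wavy PyTorch toy of the general "split the model dimension between an attention branch and a convolution branch" pattern; the class name, dimensions, and fixed depthwise convolution are my own simplifications, not ConvBERT's actual span-based dynamic convolution):

```python
import torch
import torch.nn as nn

class MixedAttentionBlock(nn.Module):
    """Toy mixed block: half the channels go through self-attention
    (global patterns), half through a depthwise 1-D convolution
    (local patterns), and the two branches are concatenated."""

    def __init__(self, d_model=256, n_heads=4, kernel_size=9):
        super().__init__()
        self.d_half = d_model // 2
        self.attn = nn.MultiheadAttention(self.d_half, n_heads, batch_first=True)
        self.conv = nn.Conv1d(self.d_half, self.d_half, kernel_size,
                              padding=kernel_size // 2, groups=self.d_half)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        x_attn, x_conv = x.split(self.d_half, dim=-1)
        global_out, _ = self.attn(x_attn, x_attn, x_attn)
        local_out = self.conv(x_conv.transpose(1, 2)).transpose(1, 2)
        return self.out(torch.cat([global_out, local_out], dim=-1))

block = MixedAttentionBlock()
print(block(torch.randn(2, 32, 256)).shape)  # torch.Size([2, 32, 256])
```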
On your point about the value of alignment: I think there's a pretty big range of capabilities where the marginal return on extra capabilities is higher than the marginal return on extra alignment. Also, you seem focused on avoiding deception/treacherous turns, which I think are a small part of alignment costs until near-human capabilities.
I don’t know what sort of capabilities penalty you pay for using a myopic training objective, but I don’t think there’s much margin available before voluntary mass adoption becomes implausible.
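By "myopic" I mean something like training on immediate reward only, i.e. a discounted return with gamma = 0. A toy illustration of where the penalty could come from (my framing, hypothetical numbers):

```python
import numpy as np

# Payoff that only arrives three steps after the setup actions.
rewards = np.array([0.0, 0.0, 0.0, 1.0])

def discounted_return(rewards, gamma):
    # Value of the trajectory from step 0 under discount factor gamma.
    return sum(r * gamma**t for t, r in enumerate(rewards))

print("myopic value (gamma=0):      ", discounted_return(rewards, 0.0))   # 0.0
print("long-horizon value (gamma=0.99):", discounted_return(rewards, 0.99))  # ~0.97
# A myopic learner sees zero value in the setup steps, which is one intuition
# for why a myopic objective could cost capabilities on multi-step tasks.
```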