gwern comments on Transformer language models are doing something more general