gwern comments on Finding Deception in Language Models