Nina Panickssery comments on Modulating sycophancy in an RLHF model via activation steering