neverix comments on SAE features for refusal and sycophancy steering vectors