Rohin Shah comments on Takes on “Alignment Faking in Large Language Models”