Why do so many of the "External reviews of 'Alignment faking in large language models'" read as if they, too, were written or edited by LLMs?
Are people expected to take reviews done seemingly pro forma at face value?