@Zvi I’m curious if you have thoughts on Buck’s post here (and my comment here) about how empirical evidence of scheming might not cause people to update toward seeing scheming as a legitimate/scary threat model, unless they already had priors or theoretical context that made them concerned about it.
Do you think this is true? And if so, what implications do you think this has for people who are trying to help the world better understand misalignment/scheming?
I found the description of warning fatigue interesting. Do you have takes on the warning fatigue concern?