Interpretability progress, if it is to be useful for alignment, is not primarily bottlenecked on highly legible problems right now. So I expect the problems in the post to apply in full, at least for now.
Interpretability progress, if it is to be useful for alignment, is not primarily bottlenecked on highly legible problems right now. So I expect the problems in the post to apply in full, at least for now.