To clarify: are you saying that since you perceive Chris Olah as mostly intrinsically caring about understanding neural networks (instead of mostly caring about alignment), you conclude that his work is irrelevant to alignment?
No. I have detailed inside-view models of the alignment problem, and under those models I consider Chris Olah's work to be interesting but close to irrelevant (roughly as relevant as the work of top capability researchers). Their work does have some relevance, to be clear, since understanding how to make systems better is relevant to understanding how AGI will behave, but that relevance is pretty limited.