Thanks for writing this post. I agree with the sentiment but feel it important to highlight that it is inevitable that people assume you have good strategy takes.
In Monty Python’s “Life of Brian” there is a scene in which the titular character finds himself surrounded by a mob of people declaring him the Messiah. Brian rejects this label and flees into the desert, only to find himself standing in a shallow hole, surrounded by adherents. They declare that his reluctance to accept the title is further evidence that he really is the Messiah.
To my knowledge nobody thinks that you are the literal Messiah, but plenty of people going into AI Safety are heavily influenced by your research agenda. You work at DeepMind and have mentored a sizeable number of new researchers through MATS. 80,000 Hours lists you as an example of someone with a successful career in Technical Alignment research.
To some, the fact that you request people not to blindly trust your strategic judgement is evidence that you are humble, grounded and pragmatic, all good reasons to trust your strategic judgement.
It is inevitable that people will treat your views on the Theory of Change for Interpretability as authoritative. You could literally repeat this post verbatim at the end of every single AI safety/interpretability talk you give, and some portion of junior researchers will still leave the talk deferring to your strategic judgement.
While each mind might have a maximum abstraction height, I am not convinced that the inability of people to deal with increasingly complex topics is direct evidence of this.
Is it that this topic is impossible for their mind to comprehend, or is it that they’ve simply failed to learn it in the finite time period they were given?