At the moment the A.I. world is dominated by an almost magical belief in large language models. Yes, they are marvelous, a very powerful technology. By all means, let’s understand and develop them. But they aren’t the way, the truth, and the light. They’re just a powerful and important technology. Heavy investment in them has an opportunity cost: less money to invest in other architectures and ideas.
And I’m not just talking about software, chips, and infrastructure. I’m talking about education and training. It’s not good to have a whole cohort of researchers and practitioners who know little or nothing beyond the current orthodoxy about machine learning and LLMs. That kind of mistake is very difficult to correct in the future. Why? Because correcting it means education and training. Who’s going to do it if no one knows anything else?
Moreover, in order to exploit LLMs effectively, we need to understand how they work. Mechanistic interpretability is one approach. But we’re not doing enough of it, and by itself it won’t do the job. People need to know more about language, linguistics, and cognition in order to understand what those models are doing.
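For readers who haven’t seen it, a toy illustration may make “mechanistic interpretability” concrete. The sketch below is my own example, not anything from the interpretability literature; GPT-2 and the HuggingFace transformers library are stand-ins chosen for familiarity. All it does is dump which earlier token each attention head weights most heavily, about the simplest possible peek inside a model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is used here purely for illustration; output_attentions=True
# asks the model to return its attention weights alongside the logits.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True)
model.eval()

inputs = tok("The cat sat on the", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions is a tuple with one tensor per layer,
# each shaped [batch, heads, query_pos, key_pos].
for layer, attn in enumerate(out.attentions):
    # For the final token, which earlier position does each head weight most?
    strongest = attn[0, :, -1, :].argmax(dim=-1)
    print(f"layer {layer:2d}: heads attend most to positions {strongest.tolist()}")
```

Even a crude probe like this makes the point: the internal information is there to be inspected, but making sense of it requires hypotheses about language and thought, and those hypotheses have to come from somewhere other than the models themselves.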