I agree with this, and want to add that I am also not completely sure mechanistic interpretability is a good “commercial bet” yet, based on my experience and understanding. By “commercial bet” I mean something that materializes into revenue, or is simply revenue generating.
One revenue-generating path I can see for LLMs is that a company uses interpretability methods to identify the data that are most effective for particular benchmarks. My current understanding (correct me if I am wrong) is that, for now, it is relatively costly to first research a reliable method and then run it on large models. Additionally, researchers generally already have a good intuition for which datasets would be useful for specific benchmarks. On the other hand, such methods would be much more useful for looking into nuanced, hard-to-tackle safety problems. In fact, there have been a lot of previous efforts to use interpretability for safety mitigations.