I’ve heard people argue that we should develop mechanistic interpretability methods that can be applied to any architecture.
I think the usual reason this claim is made is that the person making it thinks it’s very plausible LLMs aren’t the paradigm that leads to AGI. If that’s the case, then interpretability that’s indexed heavily on them gets us understanding of something qualitatively weaker than we’d like. I agree that there’ll be some transfer, but it seems better, and not very hard, to talk explicitly about how well different kinds of work transfer.
Many deals in the real world have a lot of positive surplus, and so do most deals I would like to make. I would still make a deal that trades something less valuable for something more valuable, but if the margins are very thin (or approaching zero), then I wouldn’t like the deal even as I made it. It can feel like a terrible deal because the deals I want would have a lot more surplus to them, ideally involving a less painful cost.