At a glance, I couldn’t find any significant capability externality, but I think that all interpretability work should, as a standard, include a paragraph explaining why the authors don’t think their work will be used to improve AI systems in an unsafe manner.
Whisper seems sufficiently far from the systems pushing the capability frontier (GPT-4 and co) that I really don’t feel concerned about that here.