I have not updated much on these results so far, though I haven’t looked at them in detail yet. My guess is that if you already had a view of SAE-style interpretability somewhat similar to mine [1,2], these papers shouldn’t be much of an additional update for you.