It seems like the post is implicitly referring to the next big paper on SAEs from one of these labs, comparable in newsworthiness to the last Anthropic paper. A big paper won’t be a negative result or a much smaller downstream application, and a big paper would compare its method against baselines where possible, so 165% is still within the ballpark.
I still agree with your comment, especially the recommendation for a time-based prediction (as I explained in my other comment here).
Thank you for your alignment work :)