Thanks!
I forgot about faithful CoT and definitely think that should be a “Step 0”. I’m also concerned here that AGI labs just won’t do the reasonable things (e.g. they’ll train for brevity, making the CoT more steganographic).
For mech-interp, yeah, we’re currently bottlenecked by:
Finding a good enough unit-of-computation (which would enable most of the higher-guarantee research)
Computing Attention_in → Attention_out (Keith got the QK-circuit → attention-pattern part working a while ago, but it hasn’t been hooked up with the OV-circuit yet); see the sketch after this list.
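For concreteness, here’s a minimal sketch (plain numpy, not Keith’s actual code; all names and shapes are illustrative assumptions) of the standard decomposition we’re trying to hook up: the QK circuit fixes the attention pattern, the OV circuit fixes what gets written into the residual stream, and Attention_in → Attention_out is the pattern applied to the OV outputs.

```python
import numpy as np

def attention_head(x, W_Q, W_K, W_V, W_O):
    """x: [seq, d_model]; W_Q/W_K/W_V: [d_model, d_head]; W_O: [d_head, d_model].
    Hypothetical single-head example for illustration only."""
    d_head = W_Q.shape[1]

    # QK circuit: (x W_Q)(x W_K)^T decides *where* each position attends.
    scores = (x @ W_Q) @ (x @ W_K).T / np.sqrt(d_head)
    scores = np.where(np.tril(np.ones_like(scores)) == 1, scores, -np.inf)  # causal mask
    pattern = np.exp(scores - scores.max(axis=-1, keepdims=True))
    pattern /= pattern.sum(axis=-1, keepdims=True)  # the attention pattern

    # OV circuit: x W_V W_O decides *what* each attended-to position writes back.
    ov_out = x @ W_V @ W_O  # [seq, d_model]

    # Attention_in -> Attention_out: attention pattern (QK) applied to OV outputs.
    return pattern @ ov_out
```

The point of splitting it this way is that the pattern and the moved information can be analyzed separately, which is what makes wiring the QK result into the OV-circuit the remaining step.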