I disagree about ELK being useless/yak shaving/too narrow. I think it’s pointing at something important and far more general than the exposition might lead one to believe. In particular, some observations:
The causal diagram stuff is just for concreteness and not actually a core part of the idea
Much of the length comes from laying out a series of naive strategies and explaining why they don’t work. The actual “system” isn’t really that elaborate
ELK is in large part about trying to overcome the limitations of IDA/Debate/etc.
ELK provides an upper bound on how powerful interpretability can possibly be in theory