I agree, i.e. I also (fairly weakly) disagree with the value of thinking of ‘distilling’ as a separate thing. Part of me wants to conjecture that it comes from thinking of alignment work predominantly as mathematics or a hard science in which the standard ‘unit’ is an original theorem or original result which might be poorly written up but can’t really be argued against much. But if we think of the area (I’m thinking predominantly about more conceptual/theoretical alignment) as a ‘softer’, messier, ongoing discourse full of different arguments from different viewpoints and under different assumptions, with counter-arguments, rejoinders, clarifications, retractions etc., that takes place across blogs, papers, talks, theorems, experiments etc. and somehow slowly works to produce progress, then it starts to be less clear what this special activity called ‘distilling’ really is.
Another relevant point, but one which I won’t bother trying to expand on much here, is that a research community assimilating—and then eventually building on—complex ideas can take a really long time.
[At risk of extending into a rant, I also just think the term is a bit off-putting. Sure, I can get the sense of what it means from the word and the way it is used (it’s not completely opaque or anything), but I’d not heard it used regularly in this way until I started looking at the Alignment Forum. What’s really so special about alignment that we need to use this word? Do we think we have figured out some new secret activity that is useful for intellectual progress that other fields haven’t figured out? Can we not get by using words like “writing” and “teaching” and “explaining”?]