Counterpoint: at least one kind of research, mechanistic interpretability, could very well be both dangerous by helping capabilities and also essential for alignment. My current intuition is that the same could be said of other research avenues.
Yes, there are plenty of dangerous ideas that aren’t so coupled with alignment, but they’re not the frustrating edge-case I’m writing about. (And, of course, I’m not doing or publishing that type of research.)
Right, and that article makes the case that in those cases you should publish. The reasoning is that the value of unpublished research decays rapidly, so if it could help alignment, publish before it loses its value.
Good catch! That certainly motivates me even more to finish my current write-ups!
Yeah exactly! Not telling anyone until the end just means you missed the chance to push society towards alignment and to let others build on your work. Don’t wait!