Something roughly like ‘make it legit’ has been, and will possibly continue to be, a personal interest of mine.
I’m posting this after Rohin entered this discussion; Rohin, I hope you don’t mind me quoting you like this, but fwiw I was significantly influenced by this comment on Buck’s old talk transcript, ‘My personal cruxes for working on AI safety’. (Rohin’s comment is repeated here in full; please bear in mind that it is three years old, and his views have surely developed, and may have moved a lot, since then:)
I enjoyed this post; it was good to see this all laid out in a single essay, rather than floating around as a bunch of separate ideas.
That said, my personal cruxes and story of impact are actually fairly different: in particular, while this post sees the impact of research as coming from solving the technical alignment problem, I care about other sources of impact as well, including:
1. Field building: Research done now can help train people who will be able to analyze problems and find solutions in the future, when we have more evidence about what powerful AI systems will look like.
2. Credibility building: It does you no good to know how to align AI systems if the people who build AI systems don’t use your solutions. Research done now helps establish the AI safety field as the people to talk to in order to keep advanced AI systems safe.
3. Influencing AI strategy: This is a catch all category meant to include the ways that technical research influences the probability that we deploy unsafe AI systems in the future. For example, if technical research provides more clarity on exactly which systems are risky and which ones are fine, it becomes less likely that people build the risky systems (nobody _wants_ an unsafe AI system), even though this research doesn’t solve the alignment problem.
As a result, cruxes 3-5 in this post would not actually be cruxes for me (though 1 and 2 would be).
Rohin’s reply:
I still endorse that comment, though I’ll note that it argues for the much weaker claims of
1. I would not stop working on alignment research if it turned out I wasn’t solving the technical alignment problem
2. There are useful impacts of alignment research other than solving the technical alignment problem
(As opposed to something more like “the main thing you should work on is ‘make alignment legit’”.)
(Also I’m glad to hear my comments are useful (or at least influential), thanks for letting me know!)