Quick take: I think LM agents that automate large chunks of prosaic alignment research should probably become the main focus of AI safety funding and person-time. I can't think of a better use of marginal funding or effort at this time.