I am optimistic that further thinking on automation prospects could identify other automation-tractable areas of alignment and control (e.g. see here for previous work).
This tag might be helpful: https://www.lesswrong.com/w/ai-assisted-alignment
Here’s a recent shortform on the topic: https://www.lesswrong.com/posts/mKgbawbJBxEmQaLSJ/davekasten-s-shortform?commentId=32jReMrHDd5vkDBwt
I wonder about getting an LLM to process the LW archive and tag any posts containing alignment ideas that seem automatable.
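
A rough sketch of what that tagging pass might look like, in Python. Everything here is an assumption for illustration: the `Post` fields, the prompt wording, and especially `ask_llm`, which is a hypothetical callable standing in for whatever model API you'd actually use (it takes a prompt string and returns the model's text reply). Fetching the archive itself is left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Post:
    # Hypothetical minimal shape for an archived post.
    post_id: str
    title: str
    body: str


# Assumed prompt wording; a real run would likely need iteration on this.
PROMPT_TEMPLATE = (
    "Does the following post contain an alignment research idea that "
    "seems automatable? Answer YES or NO on the first line.\n\n"
    "Title: {title}\n\n{body}"
)


def tag_automatable(
    posts: Iterable[Post], ask_llm: Callable[[str], str]
) -> List[str]:
    """Return the ids of posts the model flags as containing automatable ideas."""
    tagged = []
    for post in posts:
        prompt = PROMPT_TEMPLATE.format(title=post.title, body=post.body)
        reply = ask_llm(prompt)
        # Only the first line of the reply is treated as the verdict.
        lines = reply.strip().splitlines()
        first_line = lines[0] if lines else ""
        if first_line.strip().upper().startswith("YES"):
            tagged.append(post.post_id)
    return tagged
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider, and makes it easy to dry-run with a stub before spending tokens on the full archive. Flagged ids would still want a human spot-check before anything gets tagged publicly.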