One of the paths that holds non-zero hope in my mind is building a weakly aligned, non-self-improving research assistant for alignment researchers. Ought and EleutherAI’s #accelerating-alignment are the two efforts I know of that are working fairly directly in this direction, though the various language model alignment orgs might also contribute usefully to the project.