What do you (or others) think is the most promising near-term way to use language models to help with alignment? A few possible ideas:
Using LMs to help with alignment theory (e.g., alignment forum posts, ELK proposals, etc.)
Using LMs to run experiments (e.g., writing code, launching experiments, analyzing experiments, and repeat)
Using LMs as research assistants (what Ought is doing with Elicit)
Something else?