What do you (or others) think is the most promising near-term way to use language models to help with alignment? A few possible ideas:
Using LMs to help with alignment theory (e.g., alignment forum posts, ELK proposals, etc.)
Using LMs to run experiments (e.g., writing code, launching experiments, analyzing experiments, and repeat)
Using LMs as research assistants (what Ought is doing with Elicit)
Something else?