beren comments on The case for removing alignment and ML research from the training dataset

beren 1 Jun 2023 15:35 UTC
2 points
Yes, I think what I proposed here is the broadest and crudest thing that will work. It could of course be targeted much more narrowly, at the specific proposals or posts that we think are potentially the most dangerous. Using existing language models to rank these is an interesting idea.
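
To make the ranking idea a bit more concrete, here is a rough sketch of what a more targeted filter might look like. Everything in it is an assumption for illustration: `filter_by_risk`, `score_fn`, and `drop_fraction` are hypothetical names, and `keyword_score` is a toy stand-in for whatever language-model-based rater one would actually use. It only shows the shape of "score each document, rank, drop the top slice", not a tested pipeline.

```python
from typing import Callable, List, Tuple


def filter_by_risk(
    docs: List[str],
    score_fn: Callable[[str], float],
    drop_fraction: float = 0.1,
) -> Tuple[List[str], List[str]]:
    """Rank docs by a risk score and drop the highest-scoring fraction.

    score_fn is a hypothetical stand-in for a language-model-based rater;
    a higher score means "more worth excluding from the training set".
    Returns (kept, dropped), with kept in the original document order.
    """
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be in [0, 1]")
    # Indices sorted from highest to lowest score.
    ranked = sorted(range(len(docs)), key=lambda i: score_fn(docs[i]), reverse=True)
    n_drop = int(len(docs) * drop_fraction)
    dropped_idx = set(ranked[:n_drop])
    kept = [d for i, d in enumerate(docs) if i not in dropped_idx]
    dropped = [docs[i] for i in ranked[:n_drop]]
    return kept, dropped


def keyword_score(text: str) -> float:
    """Toy scorer for illustration only: counts a few made-up trigger phrases."""
    triggers = ("deceptive alignment", "reward hacking")
    lowered = text.lower()
    return float(sum(lowered.count(t) for t in triggers))


if __name__ == "__main__":
    corpus = [
        "a recipe for bread",
        "a post on reward hacking and deceptive alignment",
        "notes on reward hacking",
        "weather report",
    ]
    kept, dropped = filter_by_risk(corpus, keyword_score, drop_fraction=0.5)
    print("kept:", kept)
    print("dropped:", dropped)
```

The crude version from the post corresponds to a scorer that flags the whole topic area, while the targeted version corresponds to a more discriminating `score_fn` and a smaller `drop_fraction`.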