PeterMcCluskey comments on The case for removing alignment and ML research from the training dataset

PeterMcCluskey 31 May 2023 23:13 UTC
2 points
Filtering out entire sites seems too broad and too crude to have much benefit.

I see plenty of room to turn this into a somewhat good proposal by having GPT-4 look through the dataset for a narrow set of topics, something close to “how we will test AIs for deception”.
- beren 1 Jun 2023 15:35 UTC
  2 points
  Yes, I think what I proposed here is the broadest and crudest thing that will work. It can of course be targeted much more narrowly at the specific proposals or posts that we think are potentially most dangerous. Using existing language models to rank these is an interesting idea.
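
The targeted filtering discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: `keyword_score` is a hypothetical placeholder standing in for a language-model relevance judgment (no model is called), and `TOPIC_KEYWORDS`, `filter_dataset`, `rank_by_risk`, and the `0.5` threshold are invented for the example rather than taken from either comment.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical keywords approximating the narrow topic named in the thread
# ("how we will test AIs for deception"). An assumption for illustration only.
TOPIC_KEYWORDS = ("deception", "test", "evaluation")


def keyword_score(text: str, keywords: Iterable[str] = TOPIC_KEYWORDS) -> float:
    """Placeholder scorer: fraction of keywords that appear in the text.

    In the proposal this role would be played by an existing language model
    judging relevance to the narrow topic; here it is a crude substring check.
    """
    lowered = text.lower()
    keywords = tuple(keywords)
    hits = sum(1 for kw in keywords if kw in lowered)
    return hits / len(keywords)


def filter_dataset(
    docs: Iterable[str],
    score_fn: Callable[[str], float] = keyword_score,
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split docs into (kept, removed); a doc is removed if its score >= threshold."""
    kept: List[str] = []
    removed: List[str] = []
    for doc in docs:
        (removed if score_fn(doc) >= threshold else kept).append(doc)
    return kept, removed


def rank_by_risk(
    docs: Iterable[str],
    score_fn: Callable[[str], float] = keyword_score,
) -> List[str]:
    """Order docs from highest to lowest score, e.g. for manual review."""
    return sorted(docs, key=score_fn, reverse=True)


if __name__ == "__main__":
    sample = [
        "How we will test AIs for deception",
        "A recipe for sourdough bread",
    ]
    kept, removed = filter_dataset(sample)
    print("kept:", kept)
    print("removed:", removed)
```

Passing a different `score_fn` is the intended extension point: swapping the keyword stub for a model-based scorer would leave the filtering and ranking logic unchanged.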