If I were in your position, I would work on the ideas described in my post How to Control an LLM’s Behavior and the paper Pretraining Language Models with Human Preferences that inspired it.
From the paper’s results, the approach is very effective; my post discusses how to make it very controllable and flexible; and it has the particular advantage that, since it’s done at pretraining time, it can’t easily be fine-tuned away out of an open-source model. (Admittedly, that last property might do more for your employability at Meta FAIR Paris or Mistral than at DeepMind; but then, which of those seems like the higher x-risk to solve?)
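To make the mechanism concrete, here is a minimal Python sketch of the conditional-training idea from the paper: score each pretraining document with some preference classifier, prepend a control token recording the verdict, and then condition on the “good” token at inference time. The token names, threshold, and `score_document` stand-in below are illustrative assumptions, not the paper’s exact recipe.

```python
# Sketch of conditional pretraining: tag each document with a control
# token based on a preference score, so the LM learns the conditional
# distribution p(text | token) and can be steered at inference time.

GOOD, BAD = "<|good|>", "<|bad|>"
THRESHOLD = 0.5  # assumed cutoff; in practice this is tuned per task


def score_document(text: str) -> float:
    """Hypothetical preference scorer in [0, 1]; stands in for a real
    reward model or rule-based classifier."""
    return 0.0 if "unwanted behavior" in text else 1.0


def tag_document(text: str) -> str:
    """Prepend the control token the model will be conditioned on."""
    token = GOOD if score_document(text) >= THRESHOLD else BAD
    return f"{token}{text}"


corpus = ["a helpful, harmless document", "an unwanted behavior example"]
tagged_corpus = [tag_document(doc) for doc in corpus]
print(tagged_corpus)

# At inference, prompt the trained model with the good token to steer it:
#   model.generate(GOOD + user_prompt)
```

Because the tags are baked into the pretraining distribution itself, removing the steering would require retraining rather than a cheap fine-tune, which is what makes the open-source case interesting.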
I like this idea. Can I DM you about the research frontier?
Of course. I also wrote a second post on another specific application of this approach: Language Model Memorization, Copyright Law, and Conditional Pretraining Alignment.