(I may promote this to a full question)
Do we actually know what’s happening when you take an LLM trained on token prediction and fine-tune it via e.g. RLHF to get something like InstructGPT or ChatGPT? The more I think about the phenomenon, the more confused I feel.
Here is a short overview: https://openai.com/blog/instruction-following/
Do please promote to a full question; I also want to know the answer.
Done: https://www.lesswrong.com/posts/eywpzHRgXTCCAi8yt/what-s-actually-going-on-in-the-mind-of-the-model-when-we
Upvoted.