aogara comments on All AGI Safety questions welcome (especially basic ones) [April 2023]

aogara 17 Apr 2023 8:37 UTC
3 points
Some links I think do a good job:

https://huggingface.co/blog/rlhf

https://openai.com/research/instruction-following
- tgb 17 Apr 2023 12:26 UTC
  3 points
  Thank you. I was completely missing that they used a second ‘preference’ model to score outputs for the RL. I’m surprised that works!
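
To make the "second preference model" idea concrete, here is a toy sketch of how such a model can be fit from pairwise human comparisons. It is not the setup from either link: the linear model, the two-number feature vectors, the example pairs, and every function name are assumptions made up for illustration. The loss, -log sigmoid(r(chosen) - r(rejected)), is the usual Bradley-Terry style pairwise objective.

```python
import math

def reward(weights, features):
    """Scalar score from a linear 'preference' model (toy stand-in for a neural net)."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, chosen, rejected):
    """Bradley-Terry style loss: -log sigmoid(r(chosen) - r(rejected))."""
    margin = reward(weights, chosen) - reward(weights, rejected)
    return math.log1p(math.exp(-margin))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Fit the weights by plain SGD on (chosen, rejected) comparison pairs."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Gradient of the loss w.r.t. weights is -(1 - sigmoid(margin)) * (chosen - rejected),
            # so the descent step adds lr * (1 - sigmoid(margin)) * (chosen - rejected).
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                weights[i] += lr * scale * (chosen[i] - rejected[i])
    return weights

# Hypothetical features per output: [helpfulness cue, rudeness cue].
# Each pair is (output the labeler preferred, output the labeler rejected).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.9]),
    ([0.9, 0.2], [0.1, 0.1]),
]
weights = train_reward_model(pairs, dim=2)

# Score new candidate outputs with the learned model and pick the highest.
candidates = {"polite answer": [0.9, 0.0], "rude answer": [0.2, 1.0]}
best = max(candidates, key=lambda name: reward(weights, candidates[name]))
print(best)
```

In actual RLHF the learned score would not just pick among candidates as above; it would serve as the reward signal for an RL algorithm such as PPO that updates the language model itself. The part that makes this workable is the same, though: comparing two outputs is much easier for a human than writing a numeric reward, and a model fit to those comparisons can then score outputs no human has looked at.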