I’d recommend the “AI safety via conditioning predictive models” doc my coauthors and I are working on right now. It’s not quite ready to be published publicly yet, but we have a full draft that we’re currently looking for comments on. Messaged to both of you privately; feel free to share with other HAIST members.
Could I get a link to this as well?
Would also love to have a look.