I don’t have too much information, but CarperAI is planning to open-source GPT-J models fine-tuned via RLHF for a few different tasks (e.g. summarization). I think people should definitely do interpretability work on these models as soon as they are released, and it would be great to compare interpretability results on the RLHF models to results on the self-supervised base models.
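As one very rough starting point for that comparison, here is a minimal sketch of loading a base GPT-J checkpoint alongside an RLHF-tuned one and measuring how much the residual stream diverges layer by layer on the same prompt. The RLHF checkpoint name below is a placeholder (the models aren't out yet), and per-layer cosine similarity of hidden states is just one crude first-pass metric, not a claim about how this comparison should actually be done.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "EleutherAI/gpt-j-6B"
RLHF = "CarperAI/gpt-j-rlhf-summarize"  # hypothetical name; swap in the real checkpoint once released

tokenizer = AutoTokenizer.from_pretrained(BASE)  # both models share the GPT-J tokenizer
inputs = tokenizer("TL;DR: The paper shows that", return_tensors="pt")

def hidden_states(model_name):
    # Load the model with hidden states exposed and run a single forward pass.
    model = AutoModelForCausalLM.from_pretrained(
        model_name, output_hidden_states=True, torch_dtype=torch.float16
    )
    model.eval()
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states  # tuple of (num_layers + 1) tensors [batch, seq, d_model]

base_hs = hidden_states(BASE)
rlhf_hs = hidden_states(RLHF)

# Per-layer cosine similarity of the last-token residual stream: a crude way to
# see roughly where the RLHF model starts diverging from the self-supervised base.
for layer, (b, r) in enumerate(zip(base_hs, rlhf_hs)):
    sim = torch.nn.functional.cosine_similarity(b[0, -1].float(), r[0, -1].float(), dim=0)
    print(f"layer {layer:2d}: cosine similarity = {sim.item():.4f}")
```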