James Chua comments on A library for safety research in conditioning on RLHF tasks

James Chua 26 Feb 2023 16:52 UTC
2 points
For DTs, it's really just a linear function that converts the scalar reward into the same dimensions as the token embeddings.
So, e.g., a single token's embedding has a hidden size of 1024.
We can learn a linear function that takes this scalar and outputs something of size 1024.
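A minimal sketch of that linear map in PyTorch, assuming the reward arrives as one scalar per sequence. The names `hidden_size`, `reward_to_embedding`, and `rewards` are made up for illustration and are not taken from the library.

```python
import torch
import torch.nn as nn

hidden_size = 1024  # same size as a single token embedding

# Learned linear map from a scalar reward (1 feature) to the embedding size.
reward_to_embedding = nn.Linear(1, hidden_size)

# Hypothetical batch of two scalar rewards, shaped (batch, 1).
rewards = torch.tensor([[0.7], [-1.2]])

# Each reward becomes a vector that can sit alongside the token embeddings.
reward_embeddings = reward_to_embedding(rewards)  # shape: (batch, hidden_size)
```

The weights of `reward_to_embedding` would be trained together with the rest of the model, like any other embedding layer.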
The more annoying (PITA) part was offsetting the positional/attention masks/labels for this.
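Roughly what that offsetting looks like, as a sketch only. It assumes the reward embedding is prepended as one extra position at the start of the sequence and that `-100` is used as the ignored label value (the default `ignore_index` of PyTorch's cross-entropy loss). The actual placement in the library may differ, and `prepend_reward_slot` is a hypothetical helper.

```python
import torch

def prepend_reward_slot(reward_emb, token_embs, attention_mask, labels,
                        ignore_index=-100):
    """Shift everything by one position to make room for the reward embedding.

    reward_emb:     (batch, hidden)
    token_embs:     (batch, seq, hidden)
    attention_mask: (batch, seq)
    labels:         (batch, seq)
    """
    batch = token_embs.shape[0]
    device = token_embs.device

    # The reward embedding becomes position 0 of the input sequence.
    inputs_embeds = torch.cat([reward_emb.unsqueeze(1), token_embs], dim=1)

    # The reward slot should be attended to, so its mask entry is 1.
    ones = torch.ones(batch, 1, dtype=attention_mask.dtype, device=device)
    attention_mask = torch.cat([ones, attention_mask], dim=1)

    # No token is predicted for the reward slot, so its label is ignored.
    ignore = torch.full((batch, 1), ignore_index, dtype=labels.dtype, device=device)
    labels = torch.cat([ignore, labels], dim=1)

    # Position ids now cover seq + 1 positions.
    new_len = inputs_embeds.shape[1]
    position_ids = torch.arange(new_len, device=device).unsqueeze(0).expand(batch, -1)

    return inputs_embeds, attention_mask, labels, position_ids


# Tiny example: batch of 2, sequence length 3, hidden size 4.
token_embs = torch.randn(2, 3, 4)
reward_emb = torch.randn(2, 4)
mask = torch.ones(2, 3, dtype=torch.long)
labels = torch.tensor([[5, 6, 7], [8, 9, 10]])

embeds, new_mask, new_labels, pos = prepend_reward_slot(
    reward_emb, token_embs, mask, labels
)
```

Forgetting any one of these shifts leaves the tensors with mismatched sequence lengths, which is where most of the annoyance comes from.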