I share your intuitions about ultimately not needing much alignment data (and tried to get that across in the post), but quantitatively:
Recent implementations of RLHF have used on the order of thousands of hours of human feedback, so two orders of magnitude more than that would be hundreds of thousands of hours, which is much more than a few hundred hours of human feedback.
I think it’s pretty likely that we’ll be able to pay an alignment tax upwards of 1% of total training costs (essentially because people don’t want to die), in which case we could afford to spend significantly more than an additional two orders of magnitude on alignment data, if that did in fact turn out to be required.
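For concreteness, here is a minimal back-of-envelope sketch of that arithmetic. The feedback-hour scale is the one mentioned above; the assumed cost per hour of human feedback (and therefore the implied training budget at which this becomes a 1% tax) is an illustrative placeholder, not a figure from this discussion.

```python
# Back-of-envelope sketch of the comment's arithmetic, with placeholder numbers.
# The feedback-hour scale comes from the comment ("on the order of thousands of
# hours"); the cost per hour of human feedback is an illustrative assumption.

rlhf_feedback_hours = 5_000       # current RLHF-scale human feedback (order of magnitude)
scale_up = 100                    # an additional 2 orders of magnitude
cost_per_feedback_hour = 50.0     # assumed fully loaded USD cost per hour of feedback

scaled_hours = rlhf_feedback_hours * scale_up
scaled_cost = scaled_hours * cost_per_feedback_hour

# Training budget at which that much feedback would amount to a 1% alignment tax.
budget_for_1_percent_tax = scaled_cost / 0.01

print(f"Scaled-up feedback: {scaled_hours:,} hours")                          # 500,000 hours
print(f"Cost at ${cost_per_feedback_hour:.0f}/hour: ${scaled_cost:,.0f}")     # $25,000,000
print(f"Budget where this is a 1% tax: ${budget_for_1_percent_tax:,.0f}")     # $2,500,000,000
```

Plugging in your own cost-per-hour and training-budget estimates changes the exact fraction, but the qualitative point stands: a willingness to pay a few percent of training costs buys a lot of headroom on alignment data.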