Would you be a better RLHF labeler than GPT-4?
Have you ever tried to label your own data? Have you ever tried to run an evaluation, yourself, on a system that you've built? It's difficult. We humans are imperfect. We are inconsistent labelers. We are also expensive, and difficult to scale.
Let's say you wanted to create a high-quality instruction dataset, today. What would produce the better output? A high-resource model? Or a farm of hired humans?
How many human RLHF labels are incorrect?
How many benchmarks are our machines now superhuman at?
If we’re not bootstrapped, we will be soon.
See Alpaca, Constitutional AI, pre-training with preferences.
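The model-as-labeler idea behind these approaches can be sketched in a few lines: ask a strong model to judge which of two candidate responses is better, and record the verdict as an RLHF-style preference pair. This is a minimal sketch, not any particular paper's pipeline; `query_model` is a hypothetical stand-in for a real model API, stubbed here with a toy length heuristic so the sketch runs.

```python
# Sketch: using a strong model, rather than a human farm, to produce
# RLHF-style preference labels. Prompt and function names are illustrative.

JUDGE_PROMPT = (
    "Given the instruction and two candidate responses, answer 'A' if "
    "response A is better, otherwise 'B'.\n"
    "Instruction: {instruction}\nA: {a}\nB: {b}\nAnswer:"
)

def query_model(prompt: str) -> str:
    # Hypothetical stub: a real implementation would call a high-resource
    # model here. This placeholder just prefers the longer response.
    a = prompt.split("\nA: ")[1].split("\nB: ")[0]
    b = prompt.split("\nB: ")[1].split("\nAnswer:")[0]
    return "A" if len(a) >= len(b) else "B"

def label_preference(instruction: str, resp_a: str, resp_b: str) -> dict:
    """Return one preference record, labeled by the judge model."""
    verdict = query_model(
        JUDGE_PROMPT.format(instruction=instruction, a=resp_a, b=resp_b)
    )
    chosen, rejected = (resp_a, resp_b) if verdict == "A" else (resp_b, resp_a)
    return {"prompt": instruction, "chosen": chosen, "rejected": rejected}

record = label_preference(
    "Explain overfitting in one sentence.",
    "Overfitting is when a model memorizes training noise instead of "
    "learning the underlying pattern.",
    "It's bad.",
)
print(record["chosen"][:11])
```

Swap the stub for a real API call and run it over a pool of sampled responses, and you have the core loop these methods scale up.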