“We also are using it to assist humans in evaluating AI outputs, starting the second phase in our alignment strategy.”
Probably something along the lines of RLAIF (reinforcement learning from AI feedback)? Anthropic’s Claude might be more robustly tuned because of this kind of approach, though GPT-4 might already have something similar as part of its own training.