I think I saw OpenAI do some “rubric” thing which resembles the Anthropic method. It seems easy enough for me to imagine that they’d do a worse job of it though, or did something somewhat worse, since folks at Anthropic seem to be the originators of the idea (and are more likely to have a bunch of inside view stuff that helped them apply it well)
I think I saw OpenAI do some “rubric” thing which resembles the Anthropic method. It seems easy enough for me to imagine that they’d do a worse job of it though, or did something somewhat worse, since folks at Anthropic seem to be the originators of the idea (and are more likely to have a bunch of inside view stuff that helped them apply it well)