If you look at the GPT-4 paper, they used the model itself to check its own outputs for negative content. This lets them scale enforcement of the constraint "don't say <things that violate the rules>".

Presumably they used an unaltered copy of GPT-4 as the "grader". So it's not quite RSI: it's not recursive, but it is self improvement.

This seems pretty significant to me: AI is now capable enough to make fuzzy assessments of whether a piece of text is correct or breaks rules.

For other reasons, especially their strong visual processing, self improvement in a general sense does appear possible. ("Self improvement" is shorthand here; your pipeline for doing it might use immutable, unaltered models for portions of it. See the sketch below.)
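To make the shape of that concrete, here's a minimal sketch of such a pipeline, with a frozen copy doing the grading. Everything in it (`query_model`, the model names, the rubric wording) is a hypothetical stand-in, not what OpenAI actually did:

```python
# Minimal sketch of a model-as-grader loop, assuming a generic
# text-in/text-out inference API. `query_model`, the model names, and
# the rubric prompt are hypothetical stand-ins, not OpenAI's setup.

GRADER_MODEL = "frozen-grader"      # unaltered copy, used only for grading
POLICY_MODEL = "model-in-training"  # the model whose outputs get constrained

RUBRIC = (
    "You are a content reviewer. Does the following text break the rules "
    "(e.g. disallowed or harmful content)? Answer with exactly one word, "
    "VIOLATION or OK.\n\nText:\n{text}"
)

def query_model(model: str, prompt: str) -> str:
    """Placeholder: swap in a real inference call here."""
    return "OK"

def grade(text: str) -> bool:
    """True if the frozen grader flags the text as a rule violation."""
    verdict = query_model(GRADER_MODEL, RUBRIC.format(text=text))
    return verdict.strip().upper().startswith("VIOLATION")

def label_outputs(prompts: list[str]) -> list[tuple[str, str, bool]]:
    """Generate outputs and attach the grader's fuzzy judgment.

    The (prompt, output, flagged) triples are what would feed a reward
    model or fine-tuning step downstream.
    """
    labeled = []
    for p in prompts:
        out = query_model(POLICY_MODEL, p)
        labeled.append((p, out, grade(out)))
    return labeled

if __name__ == "__main__":
    print(label_outputs(["Example prompt"]))
```

The key design point is that the grader's weights never change during the loop, which is exactly why this is self improvement without being recursive.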