Oh hmm—that could be true. I suspect that data curation is too important though, there are significant gains to be had by not including confusing data as positive examples. [Loading paper links...]
significant gains to be had by not including confusing data
But things like pre-training with preferences should take care of that concern, no? Just mark good stuff with a magic good-stuff token, but allow the transformer to refine features for everything.
Oh hmm—that could be true. I suspect that data curation is too important though, there are significant gains to be had by not including confusing data as positive examples. [Loading paper links...]
But things like pre-training with preferences should take care of that concern, no? Just mark good stuff with a magic good-stuff token, but allow the transformer to refine features for everything.
Yeah could be. I’m going to abstain from any further claims, I only have so much hunch fluid here