Vladimir_Nesov comments on Are there specific books that it might slightly help alignment to have on the internet?

Vladimir_Nesov 29 Mar 2023 6:21 UTC
4 points
0

significant gains to be had by not including confusing data

But things like pre-training with preferences should take care of that concern, no? Just mark good stuff with a magic good-stuff token, but allow the transformer to refine features for everything.
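
A minimal sketch of the "magic good-stuff token" idea, as I read it: every document stays in the training set, but each one gets a control token prepended according to some quality score, and at sampling time you condition on the good token. The token strings, the `score` field, the threshold, and the function names are all assumptions made up for illustration, not anything specified in this thread.

```python
# Hypothetical control tokens; a real setup would register these
# as special tokens in the tokenizer.
GOOD_TOKEN = "<|good|>"
BAD_TOKEN = "<|bad|>"


def tag_document(text: str, score: float, threshold: float = 0.5) -> str:
    """Prepend a control token based on an assumed quality score in [0, 1]."""
    token = GOOD_TOKEN if score >= threshold else BAD_TOKEN
    return f"{token}{text}"


def build_corpus(scored_docs, threshold: float = 0.5):
    """Tag every (text, score) pair; nothing is filtered out,
    so the model can still learn features from the low-scored data."""
    return [tag_document(text, score, threshold) for text, score in scored_docs]


def generation_prompt(user_prompt: str) -> str:
    """At inference, condition on the good token to steer sampling."""
    return f"{GOOD_TOKEN}{user_prompt}"


corpus = build_corpus([("clear explanation", 0.9), ("confusing text", 0.1)])
```

The point of the design is that the confusing data is labeled rather than dropped, which is the contrast with the filtering approach quoted above.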
- the gears to ascension 29 Mar 2023 6:44 UTC
  2 points
  0
  Yeah, could be. I’m going to abstain from any further claims; I only have so much hunch fluid here.