I suspect labeling the data and using the labels is still harder than you think, though, since individual tokens don’t have truth values.
Why should they?
You could label each paragraph, for example. Then, when the LM is trained, the correct label could come before each paragraph, as a special token: <true>, <false>, <unknown> and perhaps <mixed>.
Then, during generation, you’d feed it <true> as part of the prompt, and again whenever it generates a paragraph break.
Similarly, you could do this on a per-sentence basis.
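Here’s a rough sketch of what that could look like, assuming a Hugging Face GPT-2 model. The label names and the `label_paragraph` annotation function are hypothetical placeholders for however the truth labels actually get assigned:

```python
# A minimal sketch of the control-token idea, not a definitive recipe.
from transformers import AutoTokenizer, AutoModelForCausalLM

LABELS = ["<true>", "<false>", "<unknown>", "<mixed>"]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Register the labels as special tokens and grow the embedding matrix
# so each label gets its own learnable vector.
tokenizer.add_special_tokens({"additional_special_tokens": LABELS})
model.resize_token_embeddings(len(tokenizer))

def serialize(document: str, label_paragraph) -> str:
    """Prefix every paragraph with its truth label for training.

    `label_paragraph` is a hypothetical callable mapping a paragraph
    string to one of LABELS (e.g. from human annotation).
    """
    return "\n\n".join(
        f"{label_paragraph(p)} {p}" for p in document.split("\n\n")
    )

# At generation time, condition on <true> so the model continues in
# the style of paragraphs that carried that label during training.
prompt = "<true> The boiling point of water at sea level is"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(output[0]))
```

Note the sketch only conditions the first paragraph; re-injecting <true> after each generated paragraph break (or after each sentence, for the per-sentence variant) would need a custom decoding loop.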