Potentially also relevant: "Contrastive Preference Learning: Learning from Human Feedback without RL"; "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"; "Bridging Associative Memory and Probabilistic Modeling".