I’m also most nervous about this way of modeling limitation (2)/(3), since it seems like it leads directly to the conclusion “fine-tuning always trades off truthfulness and persuasion, but conditioning can improve both.”