What’s weird is how confident the model is in choice J: above 90%. I have no idea why this happens.
text-davinci-002 is often extremely confident about its “predictions” for no apparent good reason (e.g. when generating “open-ended” text, it can be ~99% confident about the exact phrasing).
This is almost certainly due to the RLHF “Instruct” tuning text-davinci-002 has been subjected to. To whatever extent probabilities output by models trained with pure SSL can be assigned an epistemic interpretation (the model’s credence for the next token in a hypothetical training sample), that interpretation no longer holds for models modified by RLHF.
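To make "confidence" concrete, here is a minimal sketch of how one might measure it, assuming you already have a mapping from candidate next tokens to log probabilities (for example, from an API's logprobs option). The function names and the numbers below are hypothetical illustrations, not real model output.

```python
import math

def top_token_confidence(top_logprobs):
    """Return (token, probability) for the most likely candidate token."""
    token, logprob = max(top_logprobs.items(), key=lambda kv: kv[1])
    return token, math.exp(logprob)

def entropy_bits(top_logprobs):
    """Shannon entropy in bits over the listed candidates, after renormalizing.

    Only the listed candidates are considered, so this understates the
    entropy of the full next-token distribution.
    """
    probs = [math.exp(lp) for lp in top_logprobs.values()]
    total = sum(probs)
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)

# Hypothetical log probabilities, made up for illustration only.
peaked = {" J": math.log(0.93), " A": math.log(0.04), " B": math.log(0.03)}
spread = {" J": math.log(0.40), " A": math.log(0.35), " B": math.log(0.25)}

for name, dist in [("peaked", peaked), ("spread", spread)]:
    token, p = top_token_confidence(dist)
    print(f"{name}: top token {token!r} p={p:.2f} entropy={entropy_bits(dist):.2f} bits")
```

A distribution like `peaked` is the pattern described above: nearly all the mass on one token and very low entropy. Under the RLHF explanation, that number should not be read as the model's credence about the next token in a training sample.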
Is this text-davinci-002 or davinci?
text-davinci-002. Updated with a link to GitHub.