Tagging is conditioning (see also). If, instead of having an SSL model learn text, you have it learn (summary, text), then it learns to predict text from summary. A summary can be any measurement of text for which some values make useful queries. For example, if text is a chess game, a summary could be a statement of who wins. Then starting a prompt with a claim that a given side wins will tend to produce a game which that side wins. Similarly, if summary says whether text is a valid argument, you gain the ability to query for valid arguments. Finally, summary can just as well be written by a language model, using a prompt that includes text and instructions to write an appropriate summary. If summary is a free-form text description, queries can be arbitrary texts as well, including ones that never appear in the training dataset.
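To make the chess example concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three toy games, the tag names, and the helpers summarize, tag, train and generate are hypothetical. The "model" is just a table of next-token counts keyed on the whole prefix, standing in for a real SSL model. It only memorizes and does not generalize, but it shows the mechanism: the summary is a measurement of text (the result token at the end of the game), training moves it to the front, and a prompt that starts with the summary then selects matching games.

```python
from collections import Counter, defaultdict

END = "<end>"  # hypothetical end-of-sequence marker


def summarize(text):
    """Hypothetical measurement of a game: read the result token at the end."""
    result = text.split()[-1]
    return {"1-0": "white-wins", "0-1": "black-wins"}.get(result, "draw")


def tag(text):
    """Build the (summary, text) token sequence the model is trained on."""
    return [summarize(text)] + text.split() + [END]


def train(texts):
    """Count next-token frequencies for every prefix seen in training."""
    counts = defaultdict(Counter)
    for text in texts:
        tokens = tag(text)
        for i in range(1, len(tokens)):
            counts[tuple(tokens[:i])][tokens[i]] += 1
    return counts


def generate(counts, prompt, max_tokens=50):
    """Greedily continue the prompt until END or an unseen prefix."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        options = counts.get(tuple(tokens))
        if not options:
            break  # prefix never seen in training
        next_token = options.most_common(1)[0][0]
        if next_token == END:
            break
        tokens.append(next_token)
    return tokens[len(prompt):]


# Toy dataset: two of the games share the opening "e4 e5",
# so the moves alone do not determine the outcome.
games = [
    "e4 e5 Qh5 Ke7 Qxe5# 1-0",
    "f3 e5 g4 Qh4# 0-1",
    "e4 e5 Nf3 Nc6 1/2-1/2",
]
model = train(games)

# Starting the prompt with a claim of who wins selects a matching game.
white_game = generate(model, ["white-wins"])
black_game = generate(model, ["black-wins"])
print(" ".join(white_game))
print(" ".join(black_game))
```

A real language model would replace the prefix table and could interpolate between summaries, which is what lets free-form descriptions work as queries. In this sketch a tag that never appeared in training simply yields an empty continuation.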