This is also something that depends in part on tokenization, since that changes how the model “sees” letters. We shouldn’t assume that text-davinci-002 uses the same input method as base davinci (even if it uses the same output tokenization method). It is curiously much better at rhyming and arithmetic, anecdotally...
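For concreteness, here is a minimal sketch of why tokenization hides letters from the model, using the tiktoken library (not mentioned above; the model-to-encoding mapping it applies is tiktoken's own, and treating it as ground truth about the models' input methods is an assumption on my part). A word reaches the model as a few multi-character BPE chunks, not as characters, which is part of why rhyming and digit manipulation are so sensitive to the tokenizer:

```python
# Sketch only: assumes the tiktoken library is installed.
import tiktoken

# tiktoken maps these model names to different encodings
# (davinci -> r50k_base, text-davinci-002 -> p50k_base).
encodings = {
    "davinci": tiktoken.encoding_for_model("davinci"),
    "text-davinci-002": tiktoken.encoding_for_model("text-davinci-002"),
}

word = "rhyming"
for name, enc in encodings.items():
    ids = enc.encode(word)
    # Decode each token id individually to see the chunks the model "sees".
    pieces = [enc.decode([i]) for i in ids]
    print(f"{name}: {ids} -> {pieces}")
```

Running this shows the word arriving as a couple of opaque subword pieces rather than seven letters; whether the two encodings actually split a given word differently varies word by word, so this only illustrates the mechanism, not the anecdotal gap in rhyming ability.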