Interesting tests, and thanks for sharing. One question: using the model to answer questions without context looks to me like just checking whether the learned knowledge needed to answer is there at all. That's kind of a knowledge-machine approach, which none of these models are, and so it comes down to just a scaling question imho: more training data means more knowledge, which means more questions being answered. I would be interested: did you try other approaches, like few-shot prompting, to probe the learned conceptual and contextual understanding?
No, I haven’t tried few-shot prompting yet because it uses many more tokens, i.e. credits.
But I also don’t think that few-shot prompting reveals a different kind of understanding. Its main advantage seems to be that you can point the model more robustly towards what you want.
But maybe you can give an example of what you have in mind.
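For reference, a few-shot prompt is just a handful of worked examples prepended to the actual question, so the model can infer the task and answer format from context rather than from the instruction alone. A minimal sketch of how such a prompt might be assembled (the task, examples, and question here are hypothetical, not from the tests discussed above):

```python
# Hypothetical worked examples for a toy question-answering task.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

def build_few_shot_prompt(question, examples):
    """Prepend worked Q/A examples to the real question, ending with a
    bare 'A:' so the model completes the answer in the same format."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt("What is the capital of Italy?", examples)
print(prompt)
```

The token-cost concern above is visible here: each added example lengthens every single query, which is why few-shot runs burn through credits faster than zero-shot ones.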