Vivek Hebbar comments on Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)

Vivek Hebbar 9 Feb 2024 23:30 UTC
6 points
2
Can you clarify what figure 1 and figure 2 are showing?
I took the text description before figure 1 to mean {score on column after finetuning on 200 from row then 10 from column} - {score on column after finetuning on 10 from column}. But then the text right after says “Babbage fine-tuned on addition gets 27% accuracy on the multiplication dataset” which seems like a different thing.
- agg 14 Feb 2024 0:41 UTC
  1 point
  0
  Parent
  Position i, j in figure 1 represents how well a model fine-tuned on 200 examples of dataset i performs on dataset j;
  Position i, j in figure 2 represents how well a model fine-tuned on 200 examples of dataset i, and then fine-tuned on 10 examples of dataset j, performs on dataset j.