I don’t know.
I poked around on Google Scholar for a bit, trying to answer these questions, and managed to learn the following:
The term “few-shot learning” seems to have become widespread sometime around 2017.
The term is used in a bunch of papers from 2017 (example, another example).
Before 2017, it’s hard to find usage of “few-shot” but easy to find usage of “one-shot.” (Example, example)
The “one-shot” term dates back at least as far as 2003. Work before 2010 tends to lump what we would call “one-shot” and “few-shot” into a single category, as in this paper (2006) and this one (2003).
Thanks for looking into it! It’s really interesting to see computer vision research from before the deep learning revolution.
Here’s a 1984 paper that uses the term “one-shot”, apparently in the same sense as today.
A 1987 paper mentions that

Learning can be achieved by a one-shot learning process (in which each prototype is presented only once) as follows
and then after some math notes that

Several authors [2] have proposed the latter relation for one-shot learning, taking into account all second-order interactions, and have investigated the storage capacity, size of the basins of attraction, etc., for random uncorrelated patterns.
suggesting that there was already moderately active discussion of the term in the eighties.