I don’t know.
I poked around on Google Scholar for a bit, trying to answer these questions, and managed to learn the following:
The term “few-shot learning” seems to have become widespread sometime around 2017.
The term is used in a bunch of papers from 2017 (example, another example).
Before 2017, it’s hard to find usage of “few-shot” but easy to find usage of “one-shot.” (Example, example)
The “one-shot” term dates back at least as far as 2003. Work before 2010 tends to lump what we would call “one-shot” and “few-shot” into a single category, as in this paper (2006) and this one (2003).
Thanks for looking into it! It’s really interesting to see computer vision research from before the deep learning revolution.
Here’s a 1984 paper that uses the term “one-shot”, apparently in the same sense as today.
A 1987 paper mentions that

Learning can be achieved by a one-shot learning process (in which each prototype is presented only once) as follows
and then after some math notes that

Several authors [2] have proposed the latter relation for one-shot learning, taking into account all second-order interactions, and have investigated the storage capacity, size of the basins of attraction, etc., for random uncorrelated patterns.
suggesting that there was already moderately active discussion of the term in the eighties.