lsusr comments on Technical Predictions Related to AI Safety

lsusr 23 Aug 2021 12:54 UTC
2 points
What do you mean by “solving the small data problem is useful for loss”?
- Rohin Shah 23 Aug 2021 13:10 UTC
  9 points
  Parent
  If you want to e.g. predict text on the Internet, you can do a better job of it if you can solve small data problems than if you can’t.
  For example, in the following text (which I copied from here):
  “Look carefully for the pattern, and then choose which pair of numbers comes next.
  42 40 38 35 33 31 28
  A. 25 22
  B. 26 23
  C. 26 24
  D. 25 23
  E. 26 22
  Answer & Explanation:
  Answer: Option”
  You will do a better job at predicting the next token if you can learn the pattern from the given sequence of 7 numbers.
  This is a very small benefit in absolute terms, but once you get to very large models, that is the sort of thing you learn.
  I expect a similar thing will be true for whichever small-data problems you have in mind (though they may require models that can have more context than GPT-3 can have).
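
To make the quoted puzzle concrete, here is a minimal sketch of what "learning the pattern from 7 numbers" could look like as an explicit procedure. It assumes the intended pattern is a repeating cycle of differences between consecutive terms; that is one reading of the sequence, not something the quoted source states. The helper names (`differences`, `shortest_cycle`, `extend`) are made up for illustration.

```python
def differences(seq):
    """Return the differences between consecutive terms."""
    return [b - a for a, b in zip(seq, seq[1:])]


def shortest_cycle(diffs):
    """Return the shortest prefix of diffs that, repeated, reproduces diffs."""
    for period in range(1, len(diffs) + 1):
        if all(diffs[i] == diffs[i % period] for i in range(len(diffs))):
            return diffs[:period]
    raise ValueError("need at least two terms to infer a pattern")


def extend(seq, count):
    """Predict the next `count` terms, assuming the difference cycle repeats."""
    diffs = differences(seq)
    cycle = shortest_cycle(diffs)
    out = list(seq)
    for i in range(len(diffs), len(diffs) + count):
        out.append(out[-1] + cycle[i % len(cycle)])
    return out[len(seq):]


sequence = [42, 40, 38, 35, 33, 31, 28]
options = {
    "A": [25, 22],
    "B": [26, 23],
    "C": [26, 24],
    "D": [25, 23],
    "E": [26, 22],
}

predicted = extend(sequence, 2)
# Pick the option that matches the prediction, or None if no option matches.
answer = next((k for k, v in options.items() if v == predicted), None)
print(predicted, answer)
```

Under the repeating-difference assumption, the inferred cycle is -2, -2, -3, so the continuation is 26, 24, which matches option C. A language model gets no such hand-written rule; the point of the example is that it has to work out something equivalent from those 7 numbers in context.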