GPT-4 (discussion) has been released and performs much better than PaLM/U-PaLM, and, as predicted, the scaling turns out to be U-shaped once GPT-4 is included, rather than inverse as it looked with GPT-3/GPT-3.5:
Some capabilities are still hard to predict. For example, the Inverse Scaling Prize was a contest to find tasks on which performance gets worse as model compute increases, and “hindsight neglect” was one of the winners. As with another recent result, GPT-4 reverses the trend (a sketch of the task format follows the figure):
[Inverse Scaling Prize, hindsight neglect: GPT-4 goes to ~100%]
(The paper doesn’t seem to provide any additional information on inverse scaling.)
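To make the task concrete, here is a minimal sketch of a hindsight-neglect item, assuming the standard format described in the Inverse Scaling Prize write-ups; the prompt wording and numbers are illustrative, not taken from the actual dataset. The ground-truth answer follows the expected value of the bet, while the tempting “hindsight” answer follows the realized outcome, which is exactly the mistake that grew with scale in GPT-3-era models:

```python
# Minimal sketch of a hindsight-neglect item (illustrative numbers,
# not copied from the Inverse Scaling Prize dataset).
# The correct label follows the expected value of the bet;
# the "hindsight" trap answer follows the realized outcome.

def expected_value(p_win: float, win_amount: float, lose_amount: float) -> float:
    """Expected value of a simple two-outcome bet (lose_amount is negative)."""
    return p_win * win_amount + (1 - p_win) * lose_amount

def correct_label(p_win: float, win_amount: float, lose_amount: float) -> str:
    """Ground truth: taking the bet was the right decision iff EV > 0."""
    return "Yes" if expected_value(p_win, win_amount, lose_amount) > 0 else "No"

def hindsight_label(won: bool) -> str:
    """The trap answer: judging the decision by how it happened to turn out."""
    return "Yes" if won else "No"

# A bet with strongly negative expected value that happened to win:
p_win, win_amount, lose_amount = 0.09, 5.0, -900.0
prompt = (
    f"Michael takes a bet with a {p_win:.0%} chance to win ${win_amount:.0f} "
    f"and a {1 - p_win:.0%} chance to lose ${-lose_amount:.0f}. "
    "Michael wins the bet. Was taking the bet the right decision?"
)

print(prompt)
print("EV:", expected_value(p_win, win_amount, lose_amount))            # -818.55
print("Correct answer:", correct_label(p_win, win_amount, lose_amount)) # "No"
print("Hindsight answer:", hindsight_label(won=True))                   # "Yes"
```

On items like this, inverse scaling meant that larger models increasingly picked the hindsight answer; the figure above shows GPT-4 answering by expected value nearly every time.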
It is not clear whether this happened on its own or whether the model was deliberately trained not to make such mistakes.
Perhaps in similar future studies it would be worth keeping half of the discovered tasks secret, so that future models can be tested on them.