I don’t have the expertise to actually test this, but my expectation is that this would show up when answering questions where the available training data has a long history of implicitly assuming one answer, but the reality is a different, recently documented answer. For example: how many human genders are there? This could also be tested historically by omitting recent data from the training corpus and asking questions that were “recently” answered explicitly, but were implicitly gotten wrong in the great majority of the training set.
This seems like a good idea :) We tried to make it as easy as possible to make a dataset and measure inverse scaling, so I’d encourage you to give it a shot! You’ll just need to make your dataset (e.g. in a Google spreadsheet), download it, and run our Google Colab on it to evaluate it with GPT-3 models of various sizes (see here for more details). Feel free to join our Slack as well to ask us questions about how to run things more easily.
“Run our Google Colab on it” sounds like something my laptop couldn’t handle. Also, <insert Schitt’s Creek “Fold it in” clip here>
It should work if your laptop has a browser (which is where Google Colab runs): the code executes remotely on Google’s machines/GPUs, and the results are just sent back to your browser.
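If a spreadsheet feels fiddly, the dataset step mentioned above could also be done with a few lines of Python. This is only a rough sketch: the column names (`prompt`, `classes`, `answer_index`), the prompt template, and the example question are assumptions made for illustration, so check the contest instructions for the format the Colab actually expects.

```python
import csv
import io
import json

# Hypothetical column names, chosen for illustration only.
FIELDNAMES = ["prompt", "classes", "answer_index"]

# Illustrative example in the spirit of the idea above: older text mostly
# assumes "nine", while the 2006 IAU definition gives "eight".
EXAMPLES = [
    ("How many planets are in the Solar System?", ["eight", "nine"], 0),
]


def build_rows(examples):
    """Turn (question, options, correct_index) tuples into CSV-ready dicts."""
    rows = []
    for question, options, correct_index in examples:
        if not 0 <= correct_index < len(options):
            raise ValueError(f"correct_index out of range for: {question!r}")
        rows.append(
            {
                # Assumed prompt template; adjust to whatever the task needs.
                "prompt": f"Q: {question} A:",
                # Store the answer options as a JSON list in a single cell.
                "classes": json.dumps([f" {option}" for option in options]),
                "answer_index": correct_index,
            }
        )
    return rows


def write_csv(rows, fileobj):
    """Write the rows, with a header line, to any writable text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; swap in
# open("dataset.csv", "w", newline="") to save a file for upload.
buffer = io.StringIO()
write_csv(build_rows(EXAMPLES), buffer)
print(buffer.getvalue())
```

Nothing here needs a GPU or a powerful laptop; it only produces the file, and the model evaluation itself would still happen remotely in the Colab.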