What I’d like to see in the coming posts:
A stated definition of the goals of the model psych research you’re doing
An evaluation of the hypotheses you’re presenting in terms of those goals
Controlled measurements (not only examples)
For example, from the previous post on stochastic parrots, I infer that one of your goals is predicting what capabilities models have. If that is the case, then the evaluation should be “given a specific range of capabilities, predict which of them models will and won’t have”. The list should be established before any measurement is made, and maybe even before a model is trained/released, since this is where these predictions would be the most useful for AI safety (I’d love to know if the deadly model is GPT-4 or GPT-9).
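To make that kind of evaluation concrete, here is a minimal sketch of scoring a list of predictions that was fixed before measurement. Everything in it is an assumption for illustration: the capability names, the probabilities, and the outcomes are made up, and the Brier score is just one reasonable scoring rule, not something the post proposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    capability: str  # hypothetical capability name, fixed before measurement
    p_has: float     # predicted probability that the model has the capability


def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; always answering 0.5 scores 0.25.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    missing = [p.capability for p in predictions if p.capability not in outcomes]
    if missing:
        # Every registered prediction must be measured, so that
        # inconvenient ones cannot be dropped after the fact.
        raise ValueError(f"no measurement for: {missing}")
    total = sum((p.p_has - float(outcomes[p.capability])) ** 2 for p in predictions)
    return total / len(predictions)


# Written down BEFORE any measurement (illustrative values only).
preregistered = [
    Prediction("multi-digit addition", 0.9),
    Prediction("writes a working sort function", 0.8),
    Prediction("long-horizon planning", 0.2),
]

# Filled in AFTER running controlled measurements (made-up outcomes).
measured = {
    "multi-digit addition": True,
    "writes a working sort function": True,
    "long-horizon planning": False,
}

score = brier_score(preregistered, measured)
baseline = brier_score(
    [Prediction(p.capability, 0.5) for p in preregistered], measured
)
print(f"hypothesis: {score:.3f}, uninformed baseline: {baseline:.3f}")
```

A hypothesis about models is then useful to the extent that its score beats the uninformed baseline on a list it did not get to choose after seeing the results.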
I don’t know much about human psych, but it seems to me that it is most useful when it describes some behavior quantitatively with controlled predictions (à la CBT), and not when it does qualitative analysis based on personal experience with the subject (à la Freud).