Sometimes people will give GPT-3 a prompt with a few example inputs along with the sorts of responses they’d like to see for those inputs (“few-shot learning”, right? I don’t know what 0-shot learning you’re referring to.)
No, that’s zero-shot. Few-shot is when you train on those examples instead of just stuffing them into the context.
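To make the contrast concrete, here’s a minimal sketch of the two modes; the `gpt3` call is commented out as a placeholder and the tiny torch model is only a stand-in, not anything GPT-3 actually exposes:

```python
# Toy illustration of "stuffing examples into the context" vs. "training on them".
import torch
import torch.nn as nn

examples = [("hello", "bonjour"), ("goodbye", "au revoir")]

# Zero-shot in the sense above: the demonstrations are concatenated into the
# prompt and the model's weights never change.
prompt = "\n".join(f"English: {x}\nFrench: {y}" for x, y in examples)
prompt += "\nEnglish: thanks\nFrench:"
# completion = gpt3(prompt)   # hypothetical call: a single forward pass, no updates

# Few-shot in the sense above: each demonstration drives a gradient step on
# the weights before the new query is asked.
toy_model = nn.Linear(8, 8)                         # stand-in for a real LM
opt = torch.optim.SGD(toy_model.parameters(), lr=1e-2)
for x, y in examples:
    x_vec, y_vec = torch.randn(8), torch.randn(8)   # stand-in encodings
    loss = nn.functional.mse_loss(toy_model(x_vec), y_vec)
    opt.zero_grad()
    loss.backward()
    opt.step()
```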
It looks like mesa-optimization because it seems to be doing something like learning about new tasks or new prompts that are very different from anything it’s seen before, without any training, just based on the context (0-shot).
Is your claim that GPT-3 succeeds at this sort of task by doing something akin to training a model internally?
By “training a model”, I assume you mean “an ML model” (as opposed to, e.g., a world model). Yes, I am claiming something like that, but the line between learning and inference is blurry.
I’m not saying it’s doing SGD; I don’t know what it’s doing in order to solve these new tasks. But to be clear, 96 steps of gradient descent (one per layer) could be a lot. MAML does meta-learning with one.
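For reference, here’s a minimal sketch of MAML’s inner loop with a single gradient step on a toy regression task (the outer meta-update is only roughed in; all names are illustrative):

```python
# MAML-style adaptation: one inner gradient step per task, then an outer
# update of the initial parameters through that step.
import torch
import torch.nn as nn

def adapt_one_step(model, x_support, y_support, inner_lr=0.01):
    """Return task-adapted parameters theta' = theta - alpha * grad(L_task)."""
    loss = nn.functional.mse_loss(model(x_support), y_support)
    grads = torch.autograd.grad(loss, list(model.parameters()), create_graph=True)
    return [p - inner_lr * g for p, g in zip(model.parameters(), grads)]

model = nn.Linear(1, 1)                               # shared initialization theta
x_s, y_s = torch.randn(4, 1), torch.randn(4, 1)       # support set for one task
w_adapted, b_adapted = adapt_one_step(model, x_s, y_s)

# Outer step (sketch): evaluate the adapted parameters on the task's query set
# and backprop that loss all the way into the original theta.
x_q, y_q = torch.randn(4, 1), torch.randn(4, 1)
query_pred = x_q @ w_adapted.t() + b_adapted          # manual forward pass with theta'
outer_loss = nn.functional.mse_loss(query_pred, y_q)
outer_loss.backward()                                 # gradients land on model.parameters()
```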
Thanks!