I also agree that toy models are better than nothing and that we should start with them, but I have moved away from the view that “if we understand how toy models do optimization, we understand much more about how GPT-4 does optimization”.
I have a bunch of project ideas on how small models do optimization. I have even trained the networks already; I just haven’t found the time to interpret them yet. I’m happy for someone to take over the project if they want to. I’m mainly looking for evidence against the outlined hypothesis, i.e. evidence that small toy models actually do fairly general optimization. That would definitely update my beliefs.
I’d be super interested in falsifiable predictions about what these general-purpose modules look like. Or maybe even just more concrete intuitions, e.g. what kind of general-purpose modules you would expect GPT-3 to have. I’m currently very uncertain about this.
Thank you!
I agree with your final framing.