in some sense, the model is underparametrized from the point of view of types of generalizing circuits
That is a pretty interesting idea! I’ll be interested to see if it works out. It seems possibly in tension with an SLT-like frame, where the multiple representation of generalising circuits is (in my limited understanding, from a couple of hours of explanation) a big part of the picture, though the details are a little fuzzy to me.
Interesting—what SLT prediction do you think is relevant here?
To be clear, I have only cursory familiarity with SLT. But my thought is we have something like:
Claim: the mechanism that favours generalising circuits involves the fact that symmetries mean they are overrepresented in the parameter space
Claim: generalising algorithms are underrepresented in the parameter space
These two claims seem to be in tension. Perhaps the synthesis is that only a few of the possible generalising algorithms are represented at all, but each of those is represented many times.