Hi! I’ve had some luck making architectures equivariant to a wider zoo of groups. My most interesting published results are getting a neural network to output a function, and to invert that function when the inputs are swapped (equivariance to a group of order 2, https://arxiv.org/pdf/2305.00087), and getting a neural network with two inputs to be doubly equivariant to translations: https://arxiv.org/pdf/2405.16738
These are architectural equivariances, so, as expected, they hold out of distribution.
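To make the order-2 property concrete, here's a toy illustration (just the symmetry itself, not the actual architecture from the paper): the output function is a translation of the line whose shift is antisymmetric in the two inputs, so swapping the inputs automatically inverts the output function.

```python
import numpy as np

# Toy sketch of the order-2 equivariance, not the paper's construction:
# a "network" maps two inputs to a function, and swapping the inputs
# yields that function's inverse.  The output is a translation z -> z + s
# with s(x1, x2) = -s(x2, x1), so inversion falls out of the antisymmetry.

def h(x):
    # stand-in for any learned feature map
    return np.tanh(x).sum()

def network(x1, x2):
    s = h(x1) * h(2.0 * x2) - h(x2) * h(2.0 * x1)  # antisymmetric in (x1, x2)
    return lambda z: z + s                          # output: a translation

x1, x2 = np.array([0.3, -1.2]), np.array([0.7, 0.1])
f = network(x1, x2)
f_swapped = network(x2, x1)

z = 1.5
assert np.isclose(f_swapped(f(z)), z)  # f_swapped inverts f
```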
If you need an architecture equivariant to a specific group, I can probably produce it; I’ve got quite the unpublished toolbox building up. In particular, explicit mesa-optimizers are actually easier to make equivariant: if each mesa-optimization step is equivariant to a small group, then the optimization process is typically equivariant to a larger group.
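A minimal sketch of the step-to-process part of that claim (a toy, not the unpublished constructions, and it only shows that equivariance survives composition, not the larger-group phenomenon): a gradient step on a translation-equivariant loss commutes with shifting the variables, so the iterated optimization process commutes with it too.

```python
import numpy as np

# Toy sketch: one gradient-descent step on the loss 0.5 * ||x - target||^2
# is equivariant to translating x and target together, and iterating the
# step preserves that property, so the whole process is equivariant as well.

def step(x, target, lr=0.1):
    grad = x - target          # gradient of 0.5 * ||x - target||^2 w.r.t. x
    return x - lr * grad       # step(x + t, target + t) == step(x, target) + t

def optimize(x, target, n_steps=50):
    # Each step is translation-equivariant, so the composition of steps is too:
    # optimize(x + t, target + t) == optimize(x, target) + t for any shift t.
    for _ in range(n_steps):
        x = step(x, target)
    return x

rng = np.random.default_rng(0)
x, target, t = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)

assert np.allclose(optimize(x + t, target + t), optimize(x, target) + t)
```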