Is the meta-learned net able to learn any other function at all (rather than just being frozen), or is the meta-learned stability tailored to protecting against specific tasks like sin(x)?
If my hypothesis about what the model is actually doing internally is correct, then it shouldn’t work with anything other than a constant function. I’d be interested in seeing a version of this experiment but with, say, cos(x) and sin(x) or something.
I checked the intermediate network activations. It turns out the meta-learned network generates all-negative pre-activations feeding into the final linear layer, so the ReLU activations zero out the final layer’s input, regardless of the network’s input; the output is just the final layer’s bias. You’re right about it only working for constant functions, and the mechanism is ReLU saturation rather than changes to the BatchNorm layers.
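For anyone who wants to reproduce this kind of check: below is a minimal sketch (not the original code) of inspecting the input to the final linear layer with a PyTorch forward hook. The architecture here, a small ReLU MLP built with nn.Sequential, is an assumption for illustration only.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the meta-learned network; the real architecture may differ.
net = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),  # final linear layer; if its input is all zeros, the output is just its bias
)

captured = {}

def save_final_input(module, inputs, output):
    # inputs[0] is the post-ReLU activation vector fed into the final linear layer
    captured["final_input"] = inputs[0].detach()

net[-1].register_forward_hook(save_final_input)

x = torch.linspace(-5, 5, 200).unsqueeze(1)
with torch.no_grad():
    net(x)

# If the preceding pre-activations were all negative, the ReLU zeroed everything,
# so every entry here is zero and the network's output is constant (the bias).
print((captured["final_input"] == 0).all().item())
```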
I’ve begun experiments with flipped base and meta functions (network initially models sin(x) and resists being retrained to model f(x) = 1).