I agree that the ultimate goal is to understand the weights. Seems pretty unclear whether trying to understand the activations is a useful stepping stone towards that. And it’s hard to be sure how relevant theoretical toy examples are to that question.