For example, take the L2 norms of the activations of all entries of x_i, averaged over some set of network inputs. The sum and product of those norms will both be coordinate-independent.
That would be true if the only coordinate changes we consider are rotations. But the post is talking about much more general transformations than that: we’re allowing not only general linear transformations (i.e. stretching in addition to rotations), but also nonlinear transformations (which is why ReLUs don’t give a preferred coordinate system).
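To make the rotation-versus-stretching point concrete, here is a minimal sketch. It assumes a toy 2-neuron layer with made-up activations, and it swaps in squared norms in the eigenbasis, so the "sum" and "product" become the trace and determinant of the second-moment matrix. The names `second_moment`, `trace_and_det`, and `transform` are hypothetical helpers for illustration, not anything from the post.

```python
import math

def second_moment(xs):
    """Return the 2x2 matrix M = average over inputs of x x^T."""
    n = len(xs)
    m = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        for i in range(2):
            for j in range(2):
                m[i][j] += x[i] * x[j] / n
    return m

def trace_and_det(m):
    """Sum and product of the eigenvalues of a 2x2 matrix."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace, det

def transform(a, xs):
    """Apply the 2x2 linear map `a` to every activation vector."""
    return [(a[0][0] * x[0] + a[0][1] * x[1],
             a[1][0] * x[0] + a[1][1] * x[1]) for x in xs]

# Made-up activations of a 2-neuron layer on four inputs (assumption).
acts = [(1.0, 0.5), (-0.3, 2.0), (0.8, -1.2), (2.0, 0.1)]

theta = 0.7  # arbitrary rotation angle
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
stretch = [[2.0, 0.0],
           [0.0, 1.0]]  # stretches the first coordinate only

base = trace_and_det(second_moment(acts))
rotated = trace_and_det(second_moment(transform(rotation, acts)))
stretched = trace_and_det(second_moment(transform(stretch, acts)))

print("original :", base)
print("rotated  :", rotated)    # matches the original up to float error
print("stretched:", stretched)  # both quantities change
```

Under the rotation both quantities are preserved up to floating-point error; under the stretch the trace changes and the determinant picks up a factor of 4 (the squared determinant of the stretch), so neither survives a general linear change of coordinates.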
Hm, stretching seems handleable. How about also using the weight matrix, for example? Change into the eigenbasis above, then apply stretching to make all L2 norms size 1 or size 0. Then look at the weights, as stretching-and-rotation invariant quantifiers of connectedness?
Maybe doesn’t make much sense when considering non-linear transformations though.
Sai, who is a lot more topology-savvy than me, now suspects that there is indeed a connection between this norm approach and the topology of the intermediate set. We’ll look into this.
That would be true if the only coordinate changes we consider are rotations. But the post is talking about much more general transformations than that: we’re allowing not only general linear transformations (i.e. stretching in addition to rotations), but also nonlinear transformations (which is why ReLUs don’t give a preferred coordinate system).
Ah, right, you did mention polar coordinates.
Hm, stretching seems handleable. How about also using the weight matrix, for example? Change into the eigenbasis above, then apply stretching to make all L2 norms size 1 or size 0. Then look at the weights, as stretching-and-rotation invariant quantifiers of connectedness?
Maybe doesn’t make much sense when considering non-linear transformations though.
I think that’s the same as finding a low-rank decomposition, assuming I correctly understand what you’re saying?
Sai, who is a lot more topology-savvy than me, now suspects that there is indeed a connection between this norm approach and the topology of the intermediate set. We’ll look into this.