Did you see my earlier post? Your explanation seems related to mine, but I also argued that generalization capability depends on the choice of activation function. (Many activation functions fail for neural networks, each in different ways.)
I think I recall reading that, but I’m not completely sure.
Note that the activation function affects the parameter-function map, so its influence is subsumed by the more general question of what the parameter-function map looks like.
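To make the point concrete, here is a minimal sketch (my own illustration, not from either post): a toy one-hidden-unit network evaluated with two different activations. The parameter-function map sends a parameter vector to a function; with the parameters held fixed, swapping the activation changes which function those parameters denote, so the activation choice is folded into the map itself.

```python
import math

def network(x, params, act):
    """Toy network f(x) = w2 * act(w1*x + b1) + b2."""
    w1, b1, w2, b2 = params
    return w2 * act(w1 * x + b1) + b2

# Identical parameters throughout; only the activation differs.
params = (1.0, 0.0, 1.0, 0.0)
relu = lambda z: max(z, 0.0)

f_relu = network(2.0, params, relu)       # ReLU network at x = 2
f_tanh = network(2.0, params, math.tanh)  # tanh network at x = 2

# Same point in parameter space, different functions:
print(f_relu, f_tanh)  # 2.0 vs tanh(2) ≈ 0.964
```

The two outputs differ even though the parameter vector is identical, which is exactly the sense in which the activation function is absorbed into the parameter-function map.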