I think in feed-forward networks (i.e., ones that don't re-use the same neuron multiple times), having to learn all the k_ij inhibition coefficients is too much to ask. RNNs have gone in and out of fashion, and maybe they could use something like this (perhaps scaled down a little), but you could achieve similar inhibition effects with several different architectures; LSTMs already have multiplication built into them, just in a different way. There isn't a particularly deep technical reason behind these different choices.
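To make the contrast concrete, here is a minimal sketch (PyTorch, with hypothetical names like `PairwiseInhibition`) of what learning explicit k_ij inhibition coefficients would look like, versus the elementwise gating LSTMs already do; it's an illustration of the parameter cost, not anyone's actual implementation:

```python
import torch
import torch.nn as nn

class PairwiseInhibition(nn.Module):
    """Hypothetical layer where unit j is damped by every other unit i via a learned k_ij.

    y_j = x_j * sigmoid(-sum_i k_ij * x_i)

    This adds O(n^2) extra parameters per layer, which is the
    "too much to ask" part for wide feed-forward layers.
    """
    def __init__(self, n_units: int):
        super().__init__()
        self.k = nn.Parameter(torch.zeros(n_units, n_units))  # k[i, j]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_units); inhibition_j = sum_i k_ij * x_i
        inhibition = x @ self.k
        return x * torch.sigmoid(-inhibition)

# LSTM-style gating, by contrast, multiplies two activation vectors
# elementwise (gate * candidate), so the multiplicative interaction is
# mediated by ordinary weight matrices rather than a dedicated k_ij table.
if __name__ == "__main__":
    layer = PairwiseInhibition(8)
    out = layer(torch.randn(4, 8))
    print(out.shape)  # torch.Size([4, 8])
```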