Single neurons cannot represent two distinct kinds of quantities, as would be required to do backprop (the presence of features, and gradients for training).
I don’t understand why you can’t just have some neurons which represent the former, and some neurons which represent the latter.
The dropout algorithm (which has been very popular, though it recently seems to have been largely replaced by batch normalisation).
Do you have any particular source for dropout being replaced by batch normalisation, or is it an impression from the papers you’ve been reading?
I don’t understand why you can’t just have some neurons which represent the former, and some neurons which represent the latter.
Because people thought you needed the same weights to 1) transport the gradients back, and 2) send the activations forward. Having two distinct networks with the same topology and getting the weights to match was known as the “weight transport problem”. See Grossberg, S. 1987. Competitive learning: From interactive activation to adaptive resonance. Cognitive Science 11(1):23–63.
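To make the weight transport point concrete, here is a minimal numpy sketch (my own illustration, not part of the original exchange; the shapes and variable names are arbitrary): for a single linear layer, backprop sends the error back through the transpose of the same weight matrix W used on the forward pass, so a separate feedback network with its own weights B would only compute the true gradient if B were kept equal to W.

```python
# A minimal sketch (illustration only) of where the weight transport problem
# shows up in backprop for one linear layer.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # a batch of 4 inputs with 3 features
W = rng.normal(size=(3, 2))        # forward weights of the layer

h = x @ W                          # forward pass: activations sent forward through W

grad_h = rng.normal(size=h.shape)  # stand-in for the gradient arriving from the layer above

# Exact backprop: the error signal is sent back through W.T, i.e. the feedback
# path must use the *same* weights as the forward path.
grad_x = grad_h @ W.T

# A separate feedback network with its own weights B only computes the true
# gradient when B matches W; keeping the two in sync is the weight transport problem.
B = rng.normal(size=(3, 2))
grad_x_separate = grad_h @ B.T     # not the true gradient unless B == W
```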
Do you have any particular source for dropout being replaced by batch normalisation, or is it an impression from the papers you’ve been reading?
The latter.
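For readers unfamiliar with the two techniques being compared, here is a minimal numpy sketch (my own hedged illustration, not from the discussion above) of training-time inverted dropout and batch normalisation; the dropout rate, the epsilon, and the omission of batch norm’s learned scale/shift parameters and running statistics are simplifications.

```python
# A minimal sketch (illustration only) of the two techniques compared above,
# applied to a batch of hidden activations.
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 5))               # batch of 8 examples, 5 hidden units

# Inverted dropout (training time): zero each unit with probability p and rescale
# the survivors by 1/(1-p), so the test-time network needs no change.
p = 0.5
mask = (rng.random(h.shape) >= p) / (1.0 - p)
h_dropout = h * mask

# Batch normalisation (training time): standardise each unit over the batch.
# The learned scale/shift (gamma, beta) and running statistics are omitted here.
eps = 1e-5
h_bn = (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)
```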