The paper’s first author, beren, left a detailed comment on the ACX linkpost, painting a more nuanced and uncertain (though possibly outdated by now?) picture. To quote the last paragraph:
“The brain being able to do backprop does not mean that the brain is just doing gradient descent like we do to train ANNs. It is still very possible (in my opinion likely) that the brain could be using a more powerful algorithm for inference and learning—just one that has backprop as a subroutine. Personally (and speculatively) I think it’s likely that the brain performs some highly parallelized advanced MCMC algorithm like Hamiltonian MCMC where each neuron or small group of neurons represents a single ‘particle’ following its own MCMC path. This approach naturally uses the stochastic nature of neural computation to its advantage, and allows neural populations to represent the full posterior distribution rather than just a point prediction as in ANNs.”
One of his subcomments went into more detail on this point.
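To make the "population of particles" intuition concrete, here is a minimal, purely illustrative sketch. It is not beren's proposal and uses simple random-walk Metropolis rather than the Hamiltonian variant he mentions; the bimodal target, particle count, and step size are all arbitrary assumptions chosen only to show how an ensemble of independent chains can represent a full posterior rather than a single point estimate.

```python
import numpy as np

# Toy sketch (not beren's actual algorithm): a population of independent
# MCMC "particles", each following its own chain, so that the ensemble
# approximates the full posterior rather than a single point prediction.

rng = np.random.default_rng(0)

def log_post(x):
    # Log of an unnormalized bimodal posterior (mixture of two Gaussians),
    # chosen purely for illustration.
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def metropolis_step(x, step=0.8):
    # One random-walk Metropolis update for a single particle.
    prop = x + step * rng.normal()
    if np.log(rng.uniform()) < log_post(prop) - log_post(x):
        return prop
    return x

n_particles, n_steps = 500, 2000
particles = rng.normal(size=n_particles)  # each particle starts at a random position
for _ in range(n_steps):
    particles = np.array([metropolis_step(x) for x in particles])

# The ensemble captures both modes of the posterior; a single point
# prediction (e.g. the ensemble mean, near 0 here) would discard that structure.
print("ensemble mean:", particles.mean())
print("fraction of particles near the +2 mode:", np.mean(particles > 0))
```

The mean lands between the two modes, which is exactly the information a point estimate throws away but the particle ensemble retains.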