Other resources
I’d love to gather more input on the different conditions under which equivalences (and non-equivalences!) like this hold.
While writing up this proof, I came across Whitelam et al., “Correspondence between neuroevolution and gradient descent”, a paper in Nature Communications (https://www.nature.com/articles/s41467-021-26568-2) which, on a skim, appears to reach a similar, though less general, conclusion. I have some trouble understanding their notation, though.
It’s not a mathematical argument, but here is where I first came across such an analogy between the training of neural networks and evolution, along with a potential interpretation of what it means in terms of sample (in)efficiency.