NaiveTortoise comments on A Simple Introduction to Neural Networks

NaiveTortoise 11 Feb 2020 17:50 UTC
1 point
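Another possible reason for using squared error is that, from a stats perspective, the Bayes (optimal) estimator under squared error, i.e. the constant $c$ that minimizes $E [(X - c)^{2}],$ is the mean $E [X]$ of the distribution (and the minimum value attained is the variance, $E [(X - E [X])^{2}]$), whereas the optimal estimator under absolute error (MAE), the $c$ that minimizes $E [|X - c|],$ is the median. It’s not clear to me that the mean’s what you want, but maybe?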
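
For anyone who wants to check the mean-vs-median claim numerically, here is a minimal sketch using only the Python standard library. The sample `data`, the candidate grid, and the helper names (`mse`, `mae`, `argmin_on_grid`) are made up for illustration; a brute-force grid search stands in for a proper optimizer.

```python
import statistics

def mse(data, c):
    # Mean squared error of predicting the constant c for every point.
    return sum((x - c) ** 2 for x in data) / len(data)

def mae(data, c):
    # Mean absolute error of predicting the constant c for every point.
    return sum(abs(x - c) for x in data) / len(data)

def argmin_on_grid(loss, data, candidates):
    # Brute force: return the candidate constant with the smallest loss.
    return min(candidates, key=lambda c: loss(data, c))

# Hypothetical sample with one large outlier, so mean and median differ a lot.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

# Candidate constants 0.0, 0.1, ..., 100.0 (illustrative grid).
candidates = [i / 10 for i in range(0, 1001)]

best_mse = argmin_on_grid(mse, data, candidates)
best_mae = argmin_on_grid(mae, data, candidates)

print("MSE minimizer:", best_mse, "| mean:", statistics.mean(data))
print("MAE minimizer:", best_mae, "| median:", statistics.median(data))
```

On this sample the squared-error minimizer lands on the mean, which the outlier drags upward, while the absolute-error minimizer stays at the median. That is one way to see why the choice between the two losses matters when the data has heavy tails.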