It seems to me that there are two distinct issues here: estimating a model's error on future data, and model comparison.
1. It would be useful to know the most likely value of the error on future data before we actually use the model; but is that what test-set error represents, namely the most likely value of the error on future data?
2. Why do we use techniques like WAIC and PSIS-LOO when we can (and should?) simply use p(M|D), i.e. Bayes factors, Ockham factors, model evidence, etc.? These approaches seem to handle over-fitting well (see image below). Once we have found the more plausible model, we use it to make predictions; a concrete sketch of the two quantities I am comparing follows below.
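For concreteness, here is a minimal sketch of the two quantities in question, computed for a toy known-variance normal model with a conjugate N(0, 1) prior on the mean. Everything in it (the data, the prior, the `waic` helper) is just an assumption for illustration, not anyone's reference implementation; in a real problem the log-likelihood draws would come from MCMC rather than this conjugate shortcut.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def waic(log_lik):
    """WAIC (deviance scale) from an (S, N) matrix of pointwise
    log-likelihoods: S posterior draws, N observations."""
    S = log_lik.shape[0]
    # lppd_i = log( (1/S) * sum_s p(y_i | theta_s) )
    lppd = logsumexp(log_lik, axis=0) - np.log(S)
    # p_waic_i = posterior variance of log p(y_i | theta), the effective
    # number of parameters penalty
    p_waic = np.var(log_lik, axis=0, ddof=1)
    return -2 * np.sum(lppd - p_waic)  # lower is better

rng = np.random.default_rng(0)
y = rng.normal(0.5, 1.0, size=40)      # toy observed data
sigma = 1.0                            # residual sd assumed known
n = len(y)

# Conjugate posterior for the mean under a N(0, 1) prior, so we can
# draw from it directly instead of running MCMC.
post_mean = y.sum() / (n + 1.0)
post_sd = np.sqrt(1.0 / (n + 1.0))
mu_draws = rng.normal(post_mean, post_sd, size=2000)

# (S, N) pointwise log-likelihoods under each posterior draw of mu.
log_lik = (-0.5 * np.log(2 * np.pi * sigma**2)
           - (y[None, :] - mu_draws[:, None])**2 / (2 * sigma**2))
print("WAIC:", waic(log_lik))

# The model evidence p(D|M) is analytic here: integrating out mu gives
# y ~ N(0, sigma^2 * I + tau^2 * J) with prior variance tau^2 = 1.
cov = sigma**2 * np.eye(n) + np.ones((n, n))
print("log p(D|M):", multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y))
```

My question is essentially why the first number (an estimate of out-of-sample predictive accuracy) is used for choosing between models instead of the second (the marginal likelihood that a Bayes factor would be built from).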