“… a model with more* parameters will always have less residual errors (unless you screw up the prior) and thus the in sample predictions will seem better”
Not always (unless you’re sweeping all exceptions under “unless you screw up the prior”). With more parameters, the prior probability for the region of the parameter space that fits the data well may be smaller, so the posterior may lie mostly outside this region. Note that “less residual errors” isn’t a very clear concept in Bayesian terms: there’s a posterior distribution of residual error on the training set, not a single value. (There is a single residual error when making the Bayesian prediction averaged over the posterior, but this residual error also doesn’t necessarily go down when the model becomes more complex.)
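To make the “distribution of residual error” point concrete, here is a minimal NumPy sketch of a conjugate Gaussian linear model; the toy data, prior scale, and known noise level are assumptions made for illustration, not part of the discussion above. Each posterior draw of the weights gives its own training RMSE, and averaging the predictions over the posterior gives one additional RMSE number.

```python
# Sketch: residual error as a posterior distribution vs. the single residual
# error of the posterior-averaged (Bayes) prediction, for a conjugate
# Gaussian linear model. Toy data and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: cubic-ish signal plus noise.
n = 30
x = np.linspace(-1.0, 1.0, n)
y = 0.5 * x - 0.8 * x**3 + rng.normal(scale=0.1, size=n)

# Polynomial design matrix: degree d means d + 1 parameters.
d = 5
X = np.vander(x, d + 1, increasing=True)

# Conjugate setup: known noise sd sigma, Gaussian prior N(0, tau^2 I) on weights.
sigma, tau = 0.1, 1.0
A = X.T @ X / sigma**2 + np.eye(d + 1) / tau**2   # posterior precision
cov = np.linalg.inv(A)                            # posterior covariance
mean = cov @ X.T @ y / sigma**2                   # posterior mean

# Posterior distribution of training RMSE: one RMSE per posterior draw.
draws = rng.multivariate_normal(mean, cov, size=4000)
rmse_per_draw = np.sqrt(((y - draws @ X.T) ** 2).mean(axis=1))

# Single RMSE of the Bayes prediction (the posterior predictive mean is X @ mean).
rmse_bayes = np.sqrt(((y - X @ mean) ** 2).mean())

print(f"training RMSE across posterior draws: "
      f"{rmse_per_draw.mean():.3f} +/- {rmse_per_draw.std():.3f}")
print(f"training RMSE of posterior-averaged prediction: {rmse_bayes:.3f}")
```

In this setup the averaged prediction’s RMSE is typically a bit lower than the typical per-draw RMSE, which is exactly why “residual error” has to be pinned down before comparing models of different complexity.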
“Bayesian Models just like Frequentist Models are vulnerable to overfitting if they have many parameters and weak priors.”
Actually, Bayesian models with many parameters and weak priors tend to underfit the data (assuming that by “weak” you mean “vague” / “high variance”), since in a high-dimensional parameter space a weak prior puts most of its probability on parameter values that don’t fit the data well.
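As a rough illustration of that prior-mass argument (not a full posterior analysis), the sketch below samples weights directly from a vague Gaussian prior over polynomial models of increasing degree and checks how much prior mass lands on parameter values that fit a toy data set to within roughly twice the noise level. The data, prior scale, and “good fit” threshold are all assumptions chosen for the example.

```python
# Sketch: with a vague prior, the prior mass on parameter values that fit the
# data well shrinks rapidly as the number of parameters grows. Everything
# here (data, prior sd, threshold) is illustrative.
import numpy as np

rng = np.random.default_rng(1)

# Toy training data: cubic-ish signal plus noise.
n = 30
x = np.linspace(-1.0, 1.0, n)
y = 0.5 * x - 0.8 * x**3 + rng.normal(scale=0.1, size=n)

sigma, tau = 0.1, 3.0    # noise sd, vague prior sd on every weight
threshold = 2 * sigma    # call a prior draw a "good fit" if its RMSE < 2 * noise sd

for d in (1, 3, 6, 12):
    X = np.vander(x, d + 1, increasing=True)   # degree d -> d + 1 parameters
    # Draws from the prior N(0, tau^2 I) over the d + 1 weights.
    w = rng.normal(scale=tau, size=(200_000, d + 1))
    rmse = np.sqrt(((y - w @ X.T) ** 2).mean(axis=1))
    frac_good = (rmse < threshold).mean()
    print(f"degree {d:2d}: prior fraction fitting well = {frac_good:.1e}, "
          f"median prior RMSE = {np.median(rmse):.2f}")
```

The fraction of “good fit” draws falls off quickly with the number of parameters, while the typical prior draw sits far from the data, which is the sense in which a weak prior in a high-dimensional space favors not fitting the data well.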