2) I think this is the distinction you are trying to make between the lattice model and the smoker model: in the lattice model, the equations and parameters are defined, whereas in the smoker model, the equations and parameters have to be deduced. Is that right? If so, my previous posts were referring to the smoker-type model.
Well, the real thing is that (again in the toy metamodel) you consider the complete ensemble of smoker-type models and let them fight it out for good scores when compared to the evidence. I guess you can consider this process to be deduction, sure.
3) (in response to the very end) That would be at the point where 1 bit of internal representation costs a factor of 1⁄2 in prior probability. If it was ‘minimize (reconstruction error + 2*representation size)’ then that would be a ‘temperature’ half that, where 1 more bit of internal representation costs a factor of 1⁄4 in prior probability. Colder thus corresponds to wanting your models smaller at the expense of accuracy. Sort of backwards from the usual way temperature is used in simulated annealing or MCMC systems.
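To make the correspondence concrete, here is a minimal sketch (my own illustration, not anything from the thread) of how a weight on representation size acts like an inverse temperature: scoring a model by 2^−(error + β·size) means each extra bit of representation multiplies the prior by 2^−β.

```python
def model_score(error_bits, size_bits, beta=1.0):
    """Score a model described by its reconstruction error and
    representation size (both in bits). Minimizing
    (error_bits + beta * size_bits) is equivalent to maximizing
    this score, where each extra bit of representation costs a
    factor of 2**-beta in prior probability."""
    return 2.0 ** -(error_bits + beta * size_bits)

# beta = 1: one more bit of representation halves the score.
assert model_score(0, 1) == 0.5 * model_score(0, 0)

# beta = 2 (a 'colder' temperature): one more bit costs a factor of 1/4,
# so smaller models are favored more strongly at the expense of accuracy.
assert model_score(0, 1, beta=2) == 0.25 * model_score(0, 0, beta=2)
```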
I see. You’re treating “energy” as the information required to specify a model. Your analogy and your earlier posts make sense now.