Theoretically, you should always start with the Solomonoff prior and update from there, but implementing it in practice is difficult, to say the least.
Yes, but you can check your models in a variety of ways. You can test the results inferred from your dataset by bootstrapping or cross-validating and seeing how much your results change (coefficients, estimation accuracy, etc.). Stepping up a level, you can set your model's parameters to differing values via hyperparameters, see how each variant of the model performs on the data (and then bootstrap/cross-validate each of the possible models as well), and then see how sensitive your results are to specific parameters, like, yes, whatever priors you were feeding in. You can pit families of models against each other, like logistic regression versus random forests, and see how often they disagree as another form of sensitivity analysis (again varying the hyperparameters within each family and bootstrapping/cross-validating each possible model). You can build ensembles of models drawn from various families and, obviously, vary which models are picked and what weights are put on them… and there my knowledge peters out.
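As a concrete sketch of the first check mentioned above, bootstrapping: refit a model on resampled versions of the dataset and watch how much the fitted coefficients move. Everything here (the toy data, the sample and replicate counts, the use of `np.polyfit` for the fit) is an illustrative choice, not anything prescribed by the discussion:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: y depends linearly on x, plus noise.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=200)

# Bootstrap: refit on resampled datasets and track the fitted slope.
slopes = []
for _ in range(500):
    idx = rng.integers(0, len(x), size=len(x))          # resample with replacement
    slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
    slopes.append(slope)

slopes = np.array(slopes)
print(f"slope: mean={slopes.mean():.3f}, sd={slopes.std():.3f}")
```

A small spread in the bootstrapped slope suggests the coefficient is stable under resampling; a large spread is the kind of unreliability signal the comment is pointing at.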
But while you still would not have come close to what a Solomonoff approach might do, you have still learned a great deal about your model’s reliability in a way that I can’t see as having any connection with your time- and KL-related approach.
I think there is a connection. Namely, the methods you mentioned are possible mechanisms of a learning process, but the expected KL divergence between the current and future probability distributions is a quantification of the expected impact of this learning process.
Yes, I see what you mean: the mean/expectation of how big the divergence between our current probability distribution and the future probability distribution will be. But this seems like a post hoc or purely descriptive approach: how do we estimate how much divergence there may be?
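One way this question could be answered in principle is to simulate future datasets from the current posterior predictive and average the divergences of the resulting updated posteriors from the current one. Here is a minimal sketch for a Beta-Bernoulli coin model; the prior, the batch size, and the grid-integration approximation of the KL are all illustrative assumptions, not anything specified in the thread:

```python
import math
import numpy as np

def kl_beta(a1, b1, a2, b2, n_grid=5000):
    """Numerically approximate D(Beta(a1, b1) || Beta(a2, b2)) in nats."""
    x = np.linspace(1e-6, 1 - 1e-6, n_grid)

    def log_pdf(a, b):
        log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
        return (a - 1) * np.log(x) + (b - 1) * np.log(1 - x) - log_norm

    lp, lq = log_pdf(a1, b1), log_pdf(a2, b2)
    f = np.exp(lp) * (lp - lq)
    return float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2)  # trapezoid rule

rng = np.random.default_rng(0)

a, b = 5.0, 5.0   # current posterior over a coin's bias: Beta(5, 5)
n_future = 20     # size of a hypothetical next batch of flips

# Estimate E[KL(future posterior || current posterior)] by simulating
# future datasets from the current posterior predictive distribution.
kls = []
for _ in range(1000):
    p = rng.beta(a, b)                 # draw a plausible true bias
    heads = rng.binomial(n_future, p)  # simulate the next batch of flips
    kls.append(kl_beta(a + heads, b + n_future - heads, a, b))

print(f"estimated expected divergence: {np.mean(kls):.3f} nats")
```

The estimate is prospective rather than post hoc: it uses only the current distribution to say how much movement the next batch of data is expected to cause, with broader current uncertainty or larger future batches yielding a larger expected divergence.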
Having gotten estimates of future divergence, quantifying that divergence may then be useful, but it seems like putting the cart before the horse to start with your measure.