Do you mean you’d be adding the probability distribution with that covariance matrix on top of the mean prediction from f, to make it a probabilistic prediction? I was talking about deterministic predictions before, though my text doesn’t make that clear. For probabilistic models, yes, adding an uncertainty distribution may result in non-zero likelihoods. But if we know the true dynamics are deterministic (pretend there’s no quantum effects, which are largely irrelevant for our prediction errors for systems in the classical physics domain), then we still know the model is not true, and so it seems difficult to interpret p if we were to do Bayesian updating.
Likelihoods are also not obviously (to me) very good measures of model quality for chaotic systems, either. In these cases we know that even if we had the true model, its predictions would diverge from reality due to errors in the initial condition estimates, but it would trace out the correct attractor, and it’s the attractor geometry (conditional on boundary conditions) that we’d seem to really want to assess. Perhaps then it would have a higher likelihood than every other model, but it’s not obvious to me, and it’s not obvious that there’s not a better metric for leading to good inferences when we don’t have the true model.
Basically the logic that says to use Bayes for deducing the truth does not seem to carry over in an obvious way (to me) to the case when we want to predict but can’t use the true model.
But if we know the true dynamics are deterministic (pretend there’s no quantum effects, which are largely irrelevant for our prediction errors for systems in the classical physics domain), then we still know the model is not true, and so it seems difficult to interpret p if we were to do Bayesian updating.
Ah, that’s where we need to apply more Bayes. The underlying system may be deterministic at the macroscopic level, but that does not mean we have perfect knowledge of all the things which affect the system’s trajectory. Most of the uncertainty in e.g. a weather model would not be quantum noise; it would be things like initial conditions, measurement noise (e.g. how close is this measurement to the actual average over this whole volume?), approximation errors (e.g. from discretization of the dynamics), driving conditions (are we accounting for small variations in sunlight or tidal forces?), etc. The true dynamics may be deterministic, but that doesn’t mean that our estimates of all the things which go into those dynamics have no uncertainty. If the inputs have uncertainty (which of course they do), then the outputs also have uncertainty.
The main point of probabilistic models is not to handle “random” behavior in the environment; it’s to quantify uncertainty resulting from our own (lack of) knowledge of the system’s inputs/parameters.
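To make the "uncertain inputs give uncertain outputs" point concrete, here is a minimal sketch that pushes an uncertain initial condition through fully deterministic dynamics. The logistic map, the function names, and all parameter values are my own stand-ins for illustration; nothing here comes from an actual weather model.

```python
import random
import statistics


def logistic_step(x, r=4.0):
    """One step of the logistic map: a deterministic toy system (assumed example)."""
    return r * x * (1.0 - x)


def propagate(x0, n_steps, r=4.0):
    """Run the deterministic dynamics forward from a known initial state."""
    x = x0
    for _ in range(n_steps):
        x = logistic_step(x, r)
    return x


def ensemble_spread(x0_mean, x0_sd, n_steps, n_members=2000, seed=0):
    """Standard deviation of the final state over an ensemble whose only
    source of randomness is our uncertainty about the initial condition."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_members):
        # Sample an initial condition from our (assumed Gaussian) belief,
        # clamped to the map's domain [0, 1].
        x0 = min(1.0, max(0.0, rng.gauss(x0_mean, x0_sd)))
        finals.append(propagate(x0, n_steps))
    return statistics.pstdev(finals)
```

Calling `ensemble_spread(0.3, 1e-4, 0)` and then `ensemble_spread(0.3, 1e-4, 30)` shows a tiny initial uncertainty growing into a wide spread of outcomes, even though `propagate` itself contains no randomness at all.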
Likelihoods are also not obviously (to me) very good measures of model quality for chaotic systems, either—in these cases we know that even if we had the true model, its predictions would diverge from reality due to errors in the initial condition estimates, but it would trace out the correct attractor...
Yeah, you’re pointing to an important issue here, although it’s not actually likelihoods which are the problem—it’s point estimates. In particular, that makes linear approximations a potential issue, since they’re implicitly approximations around a point estimate. Something like a particle filter will do a much better job than a Kalman filter at tracing out an attractor, since it accounts for nonlinearity much better.
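For a rough idea of what "something like a particle filter" looks like, here is a sketch of a bootstrap particle filter on a toy chaotic system. Everything specific is an assumption made for illustration: the logistic map as the dynamics, Gaussian observation noise, the small jitter added after resampling to avoid particle collapse, and all names and parameter values. It does not include a Kalman filter, so it does not by itself demonstrate the comparison.

```python
import math
import random


def logistic_step(x, r=4.0):
    """Deterministic toy dynamics (assumed stand-in for the real system)."""
    return r * x * (1.0 - x)


def simulate(x0, n_steps, obs_sd, rng):
    """Generate a true trajectory and noisy observations of it."""
    xs, ys = [], []
    x = x0
    for _ in range(n_steps):
        x = logistic_step(x)
        xs.append(x)
        ys.append(x + rng.gauss(0.0, obs_sd))
    return xs, ys


def particle_filter(ys, obs_sd, n_particles=1000, jitter_sd=0.01, seed=1):
    """Bootstrap particle filter; returns the posterior-mean state per step."""
    rng = random.Random(seed)
    # Start from a flat belief over the whole domain, not a point estimate.
    particles = [rng.random() for _ in range(n_particles)]
    means = []
    for y in ys:
        # Predict: push every particle through the full nonlinear dynamics.
        particles = [logistic_step(p) for p in particles]
        # Update: weight each particle by the Gaussian likelihood of y.
        weights = [math.exp(-0.5 * ((y - p) / obs_sd) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # Guard against numerical underflow: fall back to equal weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        means.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample in proportion to weight, then jitter slightly so that
        # duplicated particles do not stay identical under deterministic dynamics.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [min(1.0, max(0.0, p + rng.gauss(0.0, jitter_sd)))
                     for p in particles]
    return means
```

The particles carry the whole belief distribution through the nonlinear map, so nothing is linearized around a single point. A typical use would be `xs, ys = simulate(0.3, 50, 0.05, random.Random(0))` followed by `particle_filter(ys, 0.05)`.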
Anyway, reasoning with likelihoods and posterior distributions remains valid regardless of whether we’re using point estimates. When the system is chaotic but has an attractor, the posterior probability of the system state will end up smeared pretty evenly over the whole attractor. (Although with enough fine-grained data, we can keep track of roughly where on the attractor the system is at each time, which is why Kalman-type models work well in that case.)
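For reference, the reasoning with likelihoods and posteriors can be written as the standard Bayesian filtering recursion. The notation is my own assumption: x_t is the system state, y_t the observation at time t, and f the deterministic dynamics, so the transition density is a delta function.

```latex
% Predict: push the previous posterior through the deterministic dynamics f
p(x_t \mid y_{1:t-1}) = \int \delta\bigl(x_t - f(x_{t-1})\bigr)\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}

% Update: reweight by the likelihood of the new observation y_t
p(x_t \mid y_{1:t}) = \frac{p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{\int p(y_t \mid x_t')\, p(x_t' \mid y_{1:t-1})\, dx_t'}
```

Neither step refers to a point estimate. Without informative observations, the predict step alone spreads the distribution over the attractor; with frequent, precise observations, the update step keeps it concentrated near the current state.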