the numerical precision limits due to floating-point arithmetic was an illustrative example that upper bounds the fidelity of climate models
Yes, this is technically correct but I struggle to find this meaningful. Any kind of model or even of a calculation which uses real numbers (and therefore floating-point values) is subject to the same upper bounds.
knowing that there is an upper bound, and that it’s low enough that it might be relevant, can be enough to guide action.
Well, of course there is an upper bound. What I contest is that the bound imposed by floating-point precision is relevant here. I am also not sure what kind of guide you expect it to be.
this problem has at least one conceptually straightforward solution
In reality things are considerably more complicated. First, you assume that you can arbitrarily reduce the input uncertainty by sufficient sampling from the input distribution. The problem is that you don’t know the true input distribution. Instead you have an estimate which itself is a model and as such is different from the underlying reality. Repeated sampling from this estimated distribution can get you arbitrarily close to your estimate, but it won’t get you arbitrarily close to the underlying true values because you don’t know what they are.
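To make that first point concrete, here is a minimal Python sketch. Everything in it is an assumption chosen for illustration (a normal "true" distribution, 20 observations, the function name `resampling_gaps`); it is a toy, not a climate model. It fits a distribution to a small sample, then resamples heavily from the fitted distribution.

```python
import random
import statistics


def resampling_gaps(true_mean=10.0, true_sd=2.0, n_obs=20,
                    n_draws=200_000, seed=0):
    """Return (|estimate - truth|, |resampled mean - estimate|) for a toy setup."""
    rng = random.Random(seed)
    # The small observed sample is all the modeler actually has.
    observed = [rng.gauss(true_mean, true_sd) for _ in range(n_obs)]
    est_mean = statistics.fmean(observed)
    est_sd = statistics.stdev(observed)
    # Heavy resampling from the *estimated* distribution, not the true one.
    draws = [rng.gauss(est_mean, est_sd) for _ in range(n_draws)]
    resampled_mean = statistics.fmean(draws)
    return abs(est_mean - true_mean), abs(resampled_mean - est_mean)


if __name__ == "__main__":
    gap_to_truth, gap_to_estimate = resampling_gaps()
    print(f"estimate vs. truth:     {gap_to_truth:.4f}")
    print(f"resampled vs. estimate: {gap_to_estimate:.4f}")
```

Raising `n_draws` shrinks the second gap but leaves the first one exactly where it was, because the first gap is fixed by the 20 observations. In a real problem the first gap cannot even be computed, since the true mean is unknown.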
Second, there are many sources of uncertainty. Let me list some.
The process stability. When you model some process you typically assume that certain characteristics of it are stable, that is, they do not change over either your fit period or your forecasting period. That is not necessarily true but is a necessary assumption to build a reasonable model.
The sample. Normally you don’t have exhaustive data over the lifetime of the process you’re trying to model. You have a sample and then you estimate things (like distributions) from the sample that you have. The estimates are, of course, subject to some error.
The model uncertainty. All models are wrong in that they are not a 1:1 match to reality. The goal of modeling is to make the “wrongness” of the model acceptably low, but it will never go away completely. This is actually a biggie when you cycle your model—the model error accumulates at each iteration.
Black swan events. The fact something didn’t occur in the history visible to you is not a guarantee that it won’t occur in the future—but your ability to model the impact of such an event is very limited.
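On the model-uncertainty item, the remark that model error accumulates when you cycle the model can be sketched with a toy iteration. The growth rates and the function names `iterate` and `relative_model_error` below are made up for illustration; the "true" process is known here only because it is a toy.

```python
def iterate(rate, steps, x0=1.0):
    """Apply x <- x * (1 + rate) for the given number of steps."""
    x = x0
    for _ in range(steps):
        x *= 1.0 + rate
    return x


def relative_model_error(steps, true_rate=0.010, model_rate=0.011):
    """Relative gap between the model's trajectory and the 'true' one."""
    truth = iterate(true_rate, steps)
    model = iterate(model_rate, steps)
    return abs(model - truth) / truth


if __name__ == "__main__":
    for steps in (1, 10, 100):
        print(steps, f"{relative_model_error(steps):.4%}")
```

With these assumed rates the per-step error in the model is about 0.1%, but after 100 iterations the trajectories differ by about 10%. Nothing here depends on numerical precision; the gap comes entirely from the slightly wrong rate being applied again at every step.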
Calculation noise seems separate from parameter input uncertainty to me because it enters into this process separately.
This is true. My contention is that in most modeling (climate models, certainly) other sources of noise completely dominate the calculation noise.
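A toy comparison of the two magnitudes, as a rough sketch only: the same compounding calculation is run once with every intermediate result rounded to single precision, and once in double precision with the input rate off by 1%. The rate, step count, 1% figure, and the helper names `to_f32`, `grow`, and `compare` are all assumptions for illustration. The calculation is deliberately well-behaved, so it says nothing about chaotic systems or about discretization error.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def grow(rate, steps, single_precision=False, x0=1.0):
    """Compound x0 by (1 + rate) per step, optionally rounding to float32 each step."""
    rnd = to_f32 if single_precision else (lambda v: v)
    x = rnd(x0)
    factor = rnd(1.0 + rate)
    for _ in range(steps):
        x = rnd(x * factor)
    return x


def compare(rate=0.02, steps=100, input_rel_error=0.01):
    """Return (relative gap from float32 rounding, relative gap from input error)."""
    reference = grow(rate, steps)
    precision_gap = abs(grow(rate, steps, single_precision=True) - reference) / reference
    input_gap = abs(grow(rate * (1 + input_rel_error), steps) - reference) / reference
    return precision_gap, input_gap


if __name__ == "__main__":
    precision_gap, input_gap = compare()
    print(f"float32 rounding: {precision_gap:.2e}")
    print(f"1% input error:   {input_gap:.2e}")
```

In this setup the 1% input error moves the result by about 2%, which is several orders of magnitude more than the single-precision rounding does.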
we no longer have the guarantee that the present determines the future
You don’t have such a guarantee to start with. Specifically, there is no guarantee whatsoever that your model, if run with infinite-precision calculations, will adequately represent the future.
This is true. My contention is that in most modeling (climate models, certainly) other sources of noise completely dominate the calculation noise.
The more I think about this, the less sure I am about how true this is. I was initially thinking that the input and model uncertainties are very large. But I think Vaniver is right and this depends on the particulars of the implementation. The differences between different simulation codes for nominally identical inputs can be surprising. Both the input and model uncertainties and the implementation-dependent numerical errors can be large. (I am thinking in particular about fluid dynamics here, but it’s basically the same equations as in weather and climate modeling, so I assume my conclusions carry over as well.)
One weird idea that comes from this: You could use an approach like MILES in fluid dynamics where you treat the numerical error as a model, which could reduce uncertainty. This only makes sense in turbulence modeling and would take more time than I have to explain.
I was initially thinking that the input and model uncertainties are very large. But I think Vaniver is right and this depends on the particulars of the implementation.
I am not a climatologist, but I have a hard time imagining how the input and model uncertainties in a climate model can be driven down to the magnitudes where floating-point precision starts to matter.
If I’m reading Vaniver correctly (or possibly I’m steelmanning his argument without realizing it), he’s using round-off error (as it’s called in scientific computing) as an example of one of several numerical errors, e.g., discretization and truncation. There are further subcategories like dispersion and dissipation (the latter is the sort of “model” MILES provides for turbulent dissipation). I don’t think round-off error usually is the dominant factor, but the other numerical errors can be, and this might often be the case in fluid flow simulations on more modest hardware.
Round-off error can accumulate to dominate the numerical error if you do things wrong. See figure 38.5 for a representative illustration of the total numerical error as a function of time step. If the time step becomes very small, total numerical error actually increases due to build-up of round-off error. As I said, this only happens if you do things wrong, but it can happen.
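The same trade-off shows up in a much simpler setting than a time-stepping code, so here is a small Python stand-in (my choice of example, not the one in the figure): a forward-difference estimate of the derivative of sin(x), with the step size `h` playing the role of the time step. The function name `forward_difference_error` and the evaluation point x = 1 are assumptions for illustration.

```python
import math


def forward_difference_error(h, x=1.0):
    """Absolute error of the forward-difference estimate of d/dx sin(x)."""
    estimate = (math.sin(x + h) - math.sin(x)) / h
    return abs(estimate - math.cos(x))


if __name__ == "__main__":
    # Shrink the step from 1e-1 down to 1e-15 and watch the error.
    for exponent in range(1, 16, 2):
        h = 10.0 ** -exponent
        print(f"h = 1e-{exponent:02d}  error = {forward_difference_error(h):.3e}")
```

In double precision the error falls as `h` shrinks while truncation error dominates, bottoms out somewhere around h = 1e-8, and then climbs again as round-off in the subtraction takes over. Picking a sensible `h` (or a better difference formula) avoids the problem, which is the "if you do things wrong" part.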
Yes, I understand all that, but this isn’t the issue. The issue is how much all the assorted calculation errors matter in comparison to the rest of the uncertainty in the model.
I don’t think we disagree too much. If I had to pick one, I’d agree with you that the rest of the uncertainty is likely larger in most cases, but I think you substantially underestimate how inaccurate these numerical methods can be. Many commercial computational fluid dynamics codes use quite bad numerical methods along with large grid cells and time steps, so it seems possible to me that those errors can exceed the uncertainties in the other parameters. I can think of one case in particular in my own work where the numerical errors likely exceed the other uncertainties.