>Glancing back and forth, I keep changing my mind about whether or not I think the messy empirical data is close enough to the prediction from the normal distribution to accept your conclusion, or whether that elbow feature around 1976-80 seems compelling.
I realize you two had a long discussion about this, but my two cents: this kind of situation (where eyeballing is not enough to resolve which of two models fits the data better) is exactly the kind of situation for which formal statistical inference is very useful.
I’m a bit too busy right now to present a computation, but my first idea would be to gather the data and run a simple “bootstrappy” simulation (sketched in code below):

1) Get the original data set.
2) Generate k = 1 … N simulated samples x^k = [x^k_1, …, x^k_T] from a normal distribution with linearly increasing mean mu(t) = mu + c * t at time points t = 1960 … 2018, where c and the variance are as in the “linear increase hypothesis”.
3) Count how many of the simulated replicate time series have an elbow at 1980 that is equally or more extreme than the one observed in the data. (One could do this in a not-too-informal way by fitting a piece-wise regression model with a break at t = 1980 to each replicate time series and checking whether the two slope estimates differ by more than a predetermined threshold, such as the difference between the slopes recovered by fitting the same piece-wise model to the real data.)
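A minimal sketch of steps 2–3 in Python. The values of `mu`, `c`, `sigma`, and the `observed_gap` statistic are placeholders that would in practice be estimated from the real data, and for simplicity each segment is fit as a separate OLS line rather than as a continuous piece-wise model:

```python
import numpy as np

# --- Placeholder setup (real values would come from the original data set) ---
rng = np.random.default_rng(0)
years = np.arange(1960, 2019)   # t = 1960 ... 2018
t = years - years[0]            # time index starting at 0

# Hypothetical "linear increase hypothesis" parameters: intercept mu,
# slope c, residual sd sigma -- all placeholders here.
mu, c, sigma = 0.0, 0.05, 1.0

def piecewise_slope_gap(y, t, break_idx):
    """Fit separate OLS lines before/after the break and return the
    absolute difference between the two slope estimates."""
    s_pre = np.polyfit(t[:break_idx], y[:break_idx], 1)[0]
    s_post = np.polyfit(t[break_idx:], y[break_idx:], 1)[0]
    return abs(s_post - s_pre)

break_idx = int(np.where(years == 1980)[0][0])

# Observed statistic: in practice, fit the same piece-wise model to the
# real series; this number is a placeholder.
observed_gap = 0.08

# Parametric "bootstrappy" simulation under the linear-increase hypothesis.
N = 10_000
gaps = np.empty(N)
for k in range(N):
    y_sim = mu + c * t + rng.normal(0.0, sigma, size=len(t))
    gaps[k] = piecewise_slope_gap(y_sim, t, break_idx)

p_value = np.mean(gaps >= observed_gap)
print(f"Fraction of replicates with an elbow at least as extreme: {p_value:.3f}")
```

The printed fraction plays the role of a p-value: if very few replicates generated under the linear-increase hypothesis show a slope gap as large as the real data’s, the elbow looks like more than noise.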
This is slightly ad hoc, and there are probably fancier statistical methods for this kind of test, or you could fit some kind of Bayesian model, but I’d think such a computational exercise would be illustrative.
I think this is a great idea—I’m also too busy to do this right now and not equipped with that skillset, but I would read your work with interest if you chose to carry this out.