Those are good suggestions!
Here is what happens when we align the start dates and plot the improvements relative to the time of the first run.
I am slightly nervous about using the first run as the reference, since early data in a category is quite unreliable and basically reflects the time of the first person who thought to submit a run. But I think it should not create any problems.
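For anyone who wants to reproduce the alignment step, here is a minimal sketch of how I would describe it, not the exact code from my analysis. It assumes each category is a list of `(date, time_in_seconds)` pairs; the function name `relative_improvements` and that input layout are made up for illustration.

```python
from datetime import date

def relative_improvements(runs):
    """Align one category to its first run.

    runs: list of (date, time_in_seconds) pairs, in any order (assumed layout).
    Returns (days_since_first_run, record_time / first_run_time) for every
    run that ties or beats the standing record, starting with the first run.
    """
    runs = sorted(runs)                 # chronological order
    first_day, first_time = runs[0]     # the reference run
    best = first_time
    aligned = []
    for day, t in runs:
        if t <= best:                   # keep only record-setting runs
            best = t
            aligned.append(((day - first_day).days, best / first_time))
    return aligned

example = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 1, 11), 90.0),
    (date(2020, 1, 5), 95.0),
    (date(2020, 1, 20), 92.0),   # slower than the record, so it is dropped
]
aligned_example = relative_improvements(example)
```

Because everything is divided by the first run's time, a noisy first submission rescales the whole curve for that category, which is the source of my nervousness above.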
Interestingly, plotting the relative improvement reveals some S-curve patterns, with phases of increasing returns followed by phases of diminishing returns.
Once again, I did not manage to beat the baseline, this time by extrapolating the relative improvement times. Interestingly, using a grid to count non-improvements as observations made the extrapolation worse, so this time the best fit was achieved with log-linear regression over the last 8 weeks of data in each category.
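To make "log-linear regression over the last 8 weeks" concrete, here is a rough sketch under my own assumptions: ordinary least squares of log(relative time) against days since the first run, restricted to a 56-day window ending at the latest observation. The function name, its parameters, and the fallback for sparse windows are illustrative choices, not necessarily what the linked code does.

```python
import math

def extrapolate_log_linear(days, rel_times, horizon_days, window_days=56):
    """Fit log(rel_time) = intercept + slope * day on the most recent window
    and extrapolate horizon_days past the last observation.

    days, rel_times: parallel sequences for one category (assumed layout).
    """
    last_day = max(days)
    latest = min(r for d, r in zip(days, rel_times) if d == last_day)
    recent = [(d, math.log(r)) for d, r in zip(days, rel_times)
              if d >= last_day - window_days]
    # Assumption: with fewer than two distinct days in the window there is
    # nothing to fit, so predict no further change.
    if len({d for d, _ in recent}) < 2:
        return latest
    n = len(recent)
    mean_x = sum(d for d, _ in recent) / n
    mean_y = sum(y for _, y in recent) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in recent)
    sxy = sum((d - mean_x) * (y - mean_y) for d, y in recent)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept + slope * (last_day + horizon_days))

# Synthetic data that decays exactly exponentially, to sanity-check the fit.
demo_days = [0, 14, 28, 42, 56, 70]
demo_rel = [math.exp(-0.01 * d) for d in demo_days]
demo_prediction = extrapolate_log_linear(demo_days, demo_rel, horizon_days=30)
```

Fitting in log space means the extrapolated relative time shrinks by a constant factor per day and never goes negative, but it also cannot capture the S-curve phases mentioned above.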
As before, the code to replicate my analysis is available here.
I haven’t had time yet to include logistic models or to analyze the derivative of the improvements. If you feel so inclined, feel free to reuse my code to perform the analysis yourself; if you share the results here, we can comment on them!
PS: there is a sentence missing an ending in your comment
Ah yes, the bottle glitch...