This is cool! I like speedrunning! There’s definitely a connection between speed-running and AI optimization/misalignment (see When Bots Teach Themselves to Cheat, for example). Some specific suggestions:
Speedrun time is a minimization problem with a hard lower bound (zero seconds). So over an infinite amount of time, the plot of speedrun time against date necessarily converges to a flat line. You can avoid this by converting to an unbounded maximization problem. For example, you might wanna try plotting Speed-Run-Time-on-Game-Release divided by Speed-Run-Time at time t vs time. Some benefits of this include:
Intuitive meaning: This ratio tells you how many record speed-runs at time t could be completed over the course of a single speed-run at game release.
Partially addresses diminishing returns: Say the game’s first speed-run completes the game in 60 seconds and the second speed-run completes it in 15 seconds (a 45 second improvement). No matter how much you work at the game, it’s not possible to reduce the speed-run time by another 45 seconds (at most you can shave off 15 seconds), so diminishing returns are implied.
In contrast, if you look at the ratio, the first speed-run has a ratio of 1 (60 seconds/60 seconds), the second has a ratio of 4 (60 seconds/15 seconds), and a third one-second speed-run has a ratio of 60 (60 seconds/1 second). Between the second and third speed-run, we’ve gone from a value of 4 to a value of 60 (a 15x increase!). Diminishing returns are no longer inevitable!
Easier to visualize: By normalizing by the initial speed-run time, all games start out with the same value regardless of how long they objectively take. This will allow you to more easily identify similarities between the trends.
More comparable to tech progress: Since diminishing returns aren’t inevitable by construction, this looks more like tech progress, where diminishing returns also aren’t inevitable by construction. Note that they can still show up in practice, however.
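A minimal Python sketch of the ratio described above, using the 60 s / 15 s / 1 s example from this comment. The helper name `speedup_ratios` is made up for illustration and is not taken from the post's analysis code.

```python
def speedup_ratios(record_times):
    """Ratio of the first recorded time to each record time.

    record_times: world-record times in seconds, in chronological order.
    (Hypothetical helper; not taken from the post's code.)
    """
    if not record_times:
        return []
    first = record_times[0]
    return [first / t for t in record_times]


times = [60.0, 15.0, 1.0]  # the worked example from the comment
ratios = speedup_ratios(times)

# Absolute seconds saved by each new record (can never exceed the previous record time).
seconds_saved = [a - b for a, b in zip(times, times[1:])]
# Multiplicative growth of the ratio between consecutive records.
ratio_growth = [b / a for a, b in zip(ratios, ratios[1:])]

print(ratios)         # [1.0, 4.0, 60.0]
print(seconds_saved)  # [45.0, 14.0]
print(ratio_growth)   # [4.0, 15.0]
```

The seconds saved drop from 45 to 14 while the ratio jumps by 15x, which is the contrast the example is pointing at.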
Instead of plotting absolute dates, plot time relative to when the first speed-run was registered. That is, set the date of the first speed-run to t=0. This should help you identify trends.
A lot of the games you review indicate that, in many cases, our best speed-run time so far isn’t even 3x as fast as the original speed-run. This implies that optimizing speed-run time (or the ratio I introduce above) is bounded and you can’t get more than a factor of 3 or 4 in terms of improvement. But obviously tech capabilities have improved by several orders of magnitude. So structurally, I don’t think speed-running can be particularly predictive of tech advances.
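As a small sketch of that re-alignment, assuming each record comes with a calendar date: subtract the earliest date so every category starts at t=0. The function name and the dates below are made up for illustration.

```python
from datetime import date


def days_since_first_run(run_dates):
    """Convert absolute run dates to days elapsed since the earliest run."""
    start = min(run_dates)
    return [(d - start).days for d in run_dates]


# Made-up dates for illustration only.
example_dates = [date(2015, 1, 1), date(2015, 1, 31), date(2015, 3, 2)]
elapsed = days_since_first_run(example_dates)
print(elapsed)  # [0, 30, 60]
```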
Given the above, I suggest that if you want to model speed-runs, you should use functions that expect asymptotes (eg logistic equations). Combinations of logistic equations can probably capture the cascading L curves you notice in your write-up. May also be worth doing some basic analysis like counting the number of inflections in each speed-run (do this by plotting derivatives and counting the number of peaks).
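A rough sketch of both ideas, under stated assumptions: the curve is modeled as a baseline plus a sum of logistic steps, and "inflections" are counted as peaks in the finite-difference derivative. All names and parameter values are hypothetical. The curve is synthetic rather than fitted to real speed-run data, and the `min_height` threshold is an ad hoc way to ignore numerically tiny bumps.

```python
import math


def logistic(t, height, rate, midpoint):
    """Single logistic step rising from 0 to `height` around `midpoint`."""
    return height / (1.0 + math.exp(-rate * (t - midpoint)))


def stacked_logistics(t, components):
    """Sum of logistic steps; components are (height, rate, midpoint) tuples."""
    return sum(logistic(t, *c) for c in components)


def count_derivative_peaks(values, min_height=0.0):
    """Count strict local maxima of the first differences above min_height."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    peaks = 0
    for i in range(1, len(diffs) - 1):
        is_local_max = diffs[i] > diffs[i - 1] and diffs[i] > diffs[i + 1]
        if is_local_max and diffs[i] > min_height:
            peaks += 1
    return peaks


# Synthetic improvement ratio: starts near 1, with two bursts of progress.
components = [(1.0, 1.0, 10.5), (2.0, 1.0, 30.5)]
curve = [1.0 + stacked_logistics(t, components) for t in range(41)]

n_peaks = count_derivative_peaks(curve, min_height=0.01)
print(n_peaks)  # 2
```

On real record data the differences are noisy and irregularly spaced, so some smoothing or resampling would probably be needed before counting peaks.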
If you do this, I strongly suggest doing a transformation like the one I suggested above since otherwise, you’re probably gonna get diminishing returns right off the bat and logistic equations don’t expect this. If you don’t transform for whatever reason, try exponential decay.
Speed-running world records have times that, by definition, must monotonically decrease. So it’s expected that most of the plots will look like continuous functions. As you’re plotting things now, diminishing returns are built in, so you should also expect the derivatives to
Have fun out there!
Those are good suggestions!
Here is what happens when we align the start dates and plot the improvements relative to the time of the first run.
I am slightly nervous about using the first run as the reference, since early data in a category is quite unreliable and basically reflects the time of the first person who thought to submit a run. But I think it should not create any problems.
Interestingly, plotting the relative improvement reveals some S-curve patterns, with phases of increasing returns followed by phases of diminishing returns.
I did not manage to beat the baseline by extrapolating the relative improvement times either. Interestingly, using a grid to count non-improvements as observations made the extrapolation worse, so this time the best fit was achieved with log-linear regression over the last 8 weeks of data in each category.
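For readers who want to try something similar, here is a minimal sketch of a log-linear extrapolation over a trailing window. It is not the code linked in this comment. The function name, the 56-day default window and the synthetic data are all assumptions made for illustration.

```python
import math


def loglinear_extrapolate(days, values, target_day, window_days=56):
    """Fit log(value) = a + b * day on the trailing window, then extrapolate.

    days: observation times in days since the first run.
    values: positive observations (e.g. relative improvement) at those times.
    Hypothetical helper; ordinary least squares written out by hand.
    """
    latest = max(days)
    recent = [(d, math.log(v)) for d, v in zip(days, values)
              if d >= latest - window_days]
    if len(recent) < 2:
        raise ValueError("need at least two observations in the window")
    n = len(recent)
    mean_x = sum(d for d, _ in recent) / n
    mean_y = sum(y for _, y in recent) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in recent)
    if sxx == 0:
        raise ValueError("observations in the window share a single date")
    slope = sum((d - mean_x) * (y - mean_y) for d, y in recent) / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept + slope * target_day)


# Synthetic data: two early off-trend points, then exact exponential growth.
days = [0, 100, 150, 170, 190, 200]
values = [1.0, 1.05] + [math.exp(0.002 * d) for d in days[2:]]
prediction = loglinear_extrapolate(days, values, target_day=230)
print(prediction)
```

Only the observations from day 144 onward fall inside the window here, so the two early off-trend points do not affect the fit.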
As before, the code to replicate my analysis is available here.
Haven’t had time yet to include logistic models or do analysis of the derivative of the improvements. If you feel so inclined, feel free to reuse my code to perform the analysis yourself, and if you share your results here we can comment on them!
PS: there is a sentence missing an ending in your comment
Ah yes, the bottle glitch...