Differences and commonalities to expect between speedrunning and technological improvement in different fields.
Is there any way to estimate how many games speedrunners have cumulatively run by a given point? Intuitively, progress should be related to the amount of effort put in: the more people play a game, the further they can push the limits. That may explain a lot of the apparent heterogeneity, even if all games have a similar experience curve exponent.
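To make the experience-curve idea concrete, here is a minimal sketch of how one might estimate the exponent if cumulative attempt counts were available. It assumes the record time follows a power law, time ≈ a · attempts^(−b), and fits it by ordinary least squares in log-log space. The function name and the data are made up for illustration; nothing here comes from real speedrun records.

```python
import math

def fit_experience_curve(cumulative_attempts, record_times):
    """Fit log(time) = log(a) - b * log(attempts) by least squares.

    Returns (a, b), where b is the experience curve exponent.
    Assumes a pure power law, which real data may not follow.
    """
    xs = [math.log(n) for n in cumulative_attempts]
    ys = [math.log(t) for t in record_times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic example: record times generated from an exact power law.
attempts = [10, 100, 1000, 10000]
times = [3600 * n ** -0.1 for n in attempts]
a, b = fit_experience_curve(attempts, times)
print(f"a = {a:.1f}, exponent b = {b:.3f}")
```

Two games with the same exponent b but very different attempt counts would show very different amounts of progress over the same calendar time, which is the kind of heterogeneity the question is about.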
It’s also interesting because the form of the curve might suggest that each attempt has an equal chance of setting a record (the equal-odds rule; see “On the distribution of time-to-proof of mathematical conjectures”, Hisano & Sornette 2012, for math proof attempts, and the counting argument in “Scaling Scaling Laws with Board Games”, Jones 2021), which would show how progress comes from brute-force thinking.
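One way to read the equal-chance idea, as a toy model only: if every attempt were an independent draw from the same continuous distribution, then by symmetry the n-th attempt beats all earlier ones with probability 1/n, so the expected number of records after N attempts is the harmonic number H_N, roughly ln N. The sketch below checks that by simulation. The i.i.d. assumption is mine and is surely too strong for real runners, who improve with practice.

```python
import random

def count_records(n_attempts, rng):
    """Count how many attempts set a new best (lowest) time."""
    best = float("inf")
    records = 0
    for _ in range(n_attempts):
        t = rng.random()  # stand-in for a run time; i.i.d. by assumption
        if t < best:
            best = t
            records += 1
    return records

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the expected record count for i.i.d. attempts."""
    return sum(1.0 / k for k in range(1, n + 1))

def mean_records(n_attempts, n_trials, seed=0):
    """Average record count over many simulated histories."""
    rng = random.Random(seed)
    total = sum(count_records(n_attempts, rng) for _ in range(n_trials))
    return total / n_trials

if __name__ == "__main__":
    n = 1000
    print("simulated mean records:", mean_records(n, 1000))
    print("harmonic number H_n:   ", harmonic(n))
```

Under this model, records get logarithmically rarer in the number of attempts, so progress per attempt falls even though no single attempt is special.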
Also relevant: https://www.authorea.com/users/429500/articles/533177-modelling-a-time-series-of-records-in-pymc3 https://gwern.net/doc/statistics/order/index#smith-1988-section
We actually wrote a more up-to-date paper here:
https://arxiv.org/abs/2304.10004
One should be able to use the Speedrun.com API to search for the number of runs submitted by a certain date, as a proxy for the cumulative number of games played (though it will not reflect all attempts, since AFAIK many runners only submit their personal bests to speedrun.com).
Additionally, speedrun.com provides some stats on the number of runs and players for each game. For example, the current stats for Super Metroid can be found here: https://www.speedrun.com/supermetroid/gamestats
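A rough sketch of the API route, untested against the live service: it pages through the v1 `runs` endpoint for one game and counts runs dated on or before a cutoff, grouped by category ID. I am assuming the `game`, `status`, `orderby`, `direction`, `max` and `offset` query parameters and the `date`, `submitted` and `category` fields behave as I remember from the API docs; `game_id` is a placeholder for the game's API ID, and the API may also require a User-Agent header or cap how far `offset` can go.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

API_URL = "https://www.speedrun.com/api/v1/runs"

def fetch_runs(game_id, page_size=200, max_pages=50):
    """Yield run records for one game, oldest submission first (network call)."""
    for page in range(max_pages):
        query = urllib.parse.urlencode({
            "game": game_id,
            "status": "verified",
            "orderby": "submitted",
            "direction": "asc",
            "max": page_size,
            "offset": page * page_size,
        })
        with urllib.request.urlopen(f"{API_URL}?{query}") as resp:
            data = json.load(resp)["data"]
        yield from data
        if len(data) < page_size:  # short page: no more results
            break

def cumulative_runs_by_category(runs, cutoff_date):
    """Count runs dated on or before cutoff_date (YYYY-MM-DD), per category ID."""
    counts = Counter()
    for run in runs:
        # Prefer the run date; fall back to the submission timestamp's date part.
        day = run.get("date") or (run.get("submitted") or "")[:10]
        if day and day <= cutoff_date:  # ISO dates compare correctly as strings
            counts[run["category"]] += 1
    return counts
```

Usage would look like `cumulative_runs_by_category(fetch_runs(game_id), "2015-12-31")`. The counts are submitted runs, not attempts, so they are a lower bound on the effort that went in.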
There are some problems with this approach too.
These stats are aggregated by game, not by category, so one would need to somehow split the runs among the popular categories of the same game.
Only current data is available through the webpage. There might be a way to access historical data through the API. If not, one would need to use archived versions of the pages and interpolate the scraped stats.
I’d be excited to learn about the results of either approach if anybody ends up scraping this data!
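For the archived-pages route, the interpolation step could be as simple as the sketch below. It assumes you have already scraped a handful of (snapshot date, total run count) pairs from archived copies of a gamestats page, and it linearly interpolates between the two nearest snapshots. The function name and the numbers in the example are invented for illustration.

```python
from datetime import date

def interpolate_count(snapshots, target):
    """Linearly interpolate a cumulative run count at `target`.

    snapshots: list of (date, count) pairs scraped from archived pages.
    Raises ValueError if target lies outside the snapshot range,
    since extrapolating would be guesswork.
    """
    snapshots = sorted(snapshots)
    if not snapshots or not (snapshots[0][0] <= target <= snapshots[-1][0]):
        raise ValueError("target date is outside the snapshot range")
    if len(snapshots) == 1:
        return float(snapshots[0][1])
    for (d0, c0), (d1, c1) in zip(snapshots, snapshots[1:]):
        if d0 <= target <= d1:
            span = (d1 - d0).days
            if span == 0:  # two snapshots on the same day
                return float(c1)
            return c0 + (c1 - c0) * (target - d0).days / span

# Made-up snapshots, not real speedrun.com numbers.
snaps = [(date(2020, 1, 1), 100), (date(2020, 1, 11), 200)]
print(interpolate_count(snaps, date(2020, 1, 6)))
```

Linear interpolation ignores bursts of activity between snapshots (for example after a new glitch is found), so sparse archives would smooth over exactly the periods where records tend to fall.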