Here are the graphs from Hippke (he or I should publish a summary at some point, sorry).
I wanted to compare Fritz (which won the WCCC in 1995) to a modern engine in order to understand the effects of hardware and software progress. I think the time controls for that tournament were similar to SF STC. I wanted to compare to SF8 rather than one of the NNUE engines in order to isolate the effect of compute at development time and look only at test-time compute.
So having modern algorithms would have let you win the WCCC while spending about 50x less on compute than the winner. Having modern computer hardware would have let you win while spending well over 1000x less on compute than the winner. Measured this way, software progress seems several times less important than hardware progress, despite the much faster scale-up of investment in software.
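To spell out the orders-of-magnitude framing behind that comparison, here is a quick back-of-the-envelope sketch (just the arithmetic on the 50x and 1000x figures above; nothing in it comes from Hippke's graphs, and the exact numbers are illustrative):

```python
# Rough comparison of the two compute-savings factors quoted above
# (illustrative only; the "1000x" for hardware is really a lower bound).
import math

software_saving = 50     # modern algorithms on 1995-era hardware: ~50x less compute
hardware_saving = 1000   # modern hardware running 1995-era algorithms: >1000x less compute

print(f"software: ~{math.log10(software_saving):.1f} orders of magnitude of compute saved")
print(f"hardware: >{math.log10(hardware_saving):.1f} orders of magnitude of compute saved")
print(f"ratio of savings factors: {hardware_saving / software_saving:.0f}x in favor of hardware")
```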
But instead of asking “how much does hardware/software progress help you reach 1995 performance?” you could ask “how much does hardware/software progress help you reach 2015 performance?” and on that metric software progress looks far more important, because you basically can’t scale the old algorithms up to modern performance at all.
The relevant measure depends on what you are asking. But from the perspective of takeoff speeds, one very salient takeaway seems to be: if a single chess project had literally come back in time with 20 years of chess progress, that would have allowed it to spend about 50x less on compute than the leader.
ETA: but note that the ratio would be much more extreme for Deep Blue, which is another reasonable analogy you might use.
Yeah, the nonlinearity means it’s hard to know what question to ask.
If we just eyeball the graph and say that the Elo is log(log(compute)) + time (I’m totally ignoring constants here), and we assume that compute = e^t so that conveniently log(compute) = t, then d/dt Elo = 1/t + 1. The first term is from compute and the second from software. And so our history is totally not scale-free! There’s some natural timescale set by t = 1, before which chess progress was dominated by compute and after which chess progress will be (was?) dominated by software.
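For concreteness, here is a tiny numerical version of that toy model (my own sketch of the Elo = log(log(compute)) + t story above, not a fit to Hippke's data):

```python
# Toy model from above: Elo(t) = log(log(compute)) + t with compute = e^t, so
# d/dt Elo = 1/t + 1. The 1/t term is the hardware (compute) contribution and the
# constant 1 is the software contribution; they cross at t = 1.
import numpy as np

t = np.linspace(0.1, 5.0, 50)        # "time" in the toy model's arbitrary units
compute_term = 1.0 / t               # marginal Elo per unit time from more compute
software_term = np.ones_like(t)      # marginal Elo per unit time from better software

crossover = t[np.argmin(np.abs(compute_term - software_term))]
print(f"compute dominates for t < {crossover:.1f}; software dominates for t > {crossover:.1f}")
```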
Though maybe I shouldn’t spend so much time guessing at the phenomenology of chess; different problems will have different scaling behavior :P I think this is the case for text models and things like the Winograd schema challenge.