I’m quite surprised by how far out on the Elo vs compute curve we already are by a million nodes/move. Is this the main “target platform” for Stockfish, or are people mostly trying to optimize the performance for significantly smaller node counts?
(I’m wondering whether such strong diminishing returns are fundamental to the domain, or whether people are putting most of their work into optimizing performance down at more like 100 kNodes/move.)
In another comment you wrote “In between is the region with ~70 ELO; that’s where engines usually operate on present hardware with minutes of think time” which made sense to me, I’m just trying to square that with this graph.
Mhm, good point. I must admit that the “70 Elo per doubling” figure is forum wisdom and perhaps not the last word. A similar scaling experiment was done with Houdini 3 (2013), which dropped below 70 Elo per doubling beyond 4 MNodes/move. In my experiment, the drop already sets in around 1 MNode/move. So there is certainly some engine dependence.
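For intuition about what “70 Elo per doubling” buys you: the standard Elo model maps a rating gap to an expected score via E = 1/(1 + 10^(−Δ/400)). A quick sketch (standard formula; the 70-Elo input is just the rule of thumb discussed above):

```python
def expected_score(elo_diff):
    """Expected score (win prob. plus half the draw prob.) for the
    side that is elo_diff rating points stronger."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

# One doubling of nodes at 70 Elo per doubling:
print(round(expected_score(70), 3))   # ≈ 0.6, i.e. roughly 60% score
```

So one doubling of search nodes turns an even matchup into roughly a 60/40 one, which is why these per-doubling figures matter so much in engine-vs-engine testing.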
OK, I have added the Houdini data from this experiment to the plot:
The baseline Elo is not stated, but is likely close to 3200:
The results look quite different for Houdini 3 vs. SF8. Is this just a matter of Stockfish being much better optimized for small amounts of hardware?
From what I understand about the computer chess community:
Engines are optimized to win in the competitions, for reputation. There are competitions for many time controls, but the most well-respected are the CCC, with games of 3 to 15 minutes, and TCEC, which goes up to 90 minutes. So there is an incentive to tune engines well into the many-MNodes/move regime.
On the other hand, most testing during engine development is done at blitz or even bullet level (30 s for the whole game in Stockfish’s case). You can’t just play thousands of long games after each code commit to test its effect; instead, many faster games are played. That’s in the few-MNodes/move regime, so there’s some incentive to perform well there too.
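As a rough sanity check on that regime (illustrative back-of-envelope only: the moves-per-game figure and search speed are assumptions, not measurements, and the speed in particular varies widely with hardware):

```python
# Back-of-envelope: nodes/move at a fast 30 s/game testing time control.
game_time_s = 30          # 30 s for the whole game, as stated above
moves_per_game = 40       # assumed typical game length
nodes_per_s = 1.5e6       # assumed search speed (strongly hardware-dependent)

time_per_move_s = game_time_s / moves_per_game
nodes_per_move = time_per_move_s * nodes_per_s
print(f"~{nodes_per_move / 1e6:.1f} MNodes/move")
```

With these assumed numbers the testing games land on the order of 1 MNode/move, i.e. near the knee of the scaling curve discussed above.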
Below that, I think performance is “just what it is”, and nobody optimizes for it. However, it would be valuable to ask a Stockfish developer for their view.