I’m quite surprised by how far out on the Elo vs compute curve we already are by a million nodes/move. Is this the main “target platform” for Stockfish, or are people mostly trying to optimize the performance for significantly smaller node counts?
(I’m wondering whether such strong diminishing returns are fundamental to the domain, or whether people are putting most of their work into optimizing performance down at more like 100 kNodes/move.)
In another comment you wrote “In between is the region with ~70 ELO; that’s where engines usually operate on present hardware with minutes of think time” which made sense to me, I’m just trying to square that with this graph.
Mhm, good point. I must admit that the “70 Elo per doubling” figure is forum wisdom and perhaps not the last word. A similar scaling experiment was done with Houdini 3 (2013), which dropped below 70 Elo per doubling beyond 4 MNodes/move. In my experiment, the drop already sets in around 1 MNode/move. So there is certainly some engine dependence.
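For intuition about what “70 Elo per doubling” buys you: the standard Elo model maps a rating gap to an expected score via E = 1/(1 + 10^(−Δ/400)). A quick sketch (standard formula; the 70-Elo input is just the rule of thumb discussed above):

```python
def expected_score(elo_diff):
    """Expected score (win prob. plus half the draw prob.) for the
    side that is elo_diff rating points stronger."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

# One doubling of nodes at 70 Elo per doubling:
print(round(expected_score(70), 3))   # ≈ 0.6, i.e. roughly 60% score
```

So one doubling of search nodes turns an even matchup into roughly a 60/40 one, which is why these per-doubling figures matter so much in engine-vs-engine testing.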
OK, I have added the Houdini data from this experiment to the plot:
The baseline Elo is not stated, but is likely close to 3200:
The results look quite different for Houdini 3 vs. SF8. Is this just a matter of Stockfish being much better optimized for small amounts of hardware?
From what I understand about the computer chess community:
Engines are optimized to win in the competitions, for reputation. There are competitions for many time controls, but the most well-respected are the CCC, with games of 3 to 15 minutes, and TCEC, which goes up to 90 minutes. So there is an incentive to tune engines well into the many-MNodes/move regime.
On the other hand, most testing during engine development is done at blitz or even bullet level (30 s for the whole game in Stockfish’s case). You can’t just play thousands of long games after each code commit to test its effect; instead, many faster games are played. That’s in the few-MNodes/move regime, so there’s some incentive to perform well there too.
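As a rough sanity check on that regime (illustrative back-of-envelope only: the moves-per-game figure and search speed are assumptions, not measurements, and the speed in particular varies widely with hardware):

```python
# Back-of-envelope: nodes/move at a fast 30 s/game testing time control.
game_time_s = 30          # 30 s for the whole game, as stated above
moves_per_game = 40       # assumed typical game length
nodes_per_s = 1.5e6       # assumed search speed (strongly hardware-dependent)

time_per_move_s = game_time_s / moves_per_game
nodes_per_move = time_per_move_s * nodes_per_s
print(f"~{nodes_per_move / 1e6:.1f} MNodes/move")
```

With these assumed numbers the testing games land on the order of 1 MNode/move, i.e. near the knee of the scaling curve discussed above.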
Below that, I think performance is “just what it is”, and nobody optimizes for it. However, it would be valuable to ask a Stockfish developer for their view.