BTW, is Elo supposed to have that kind of linear interpretation?
It seems that, whether or not it’s supposed to, in practice it does. The just-released paper “Intrinsic Chess Ratings” uses Rybka to run exhaustive evaluations (deep enough to be ‘relatively omniscient’) of many thousands of modern chess games. From page 9:
We conclude that there is a smooth relationship between the actual players’ Elo ratings and the intrinsic quality of the move choices as measured by the chess program and the agent fitting. Moreover, the final s-fit values obtained are nearly the same for the corresponding entries of all three time periods. Since a lower s indicates higher skill, we conclude that there has been little or no ‘inflation’ in ratings over time—if anything there has been deflation. This runs counter to conventional wisdom, but is predicted by population models on which rating systems have been based [Gli99].
The results also support a no answer to question 2 [“Were the top players of earlier times as strong as the top players of today?”]. In the 1970’s there were only two players with ratings over 2700, namely Bobby Fischer and Anatoly Karpov, and there were years as late as 1981 when no one had a rating over 2700 (see [Wee00]). In the past decade there have usually been thirty or more players with such ratings. Thus lack of inflation implies that those players are better than all but Fischer and Karpov were. Extrapolated backwards, this would be consistent with the findings of [DHMG07], which however (like some recent competitions to improve on the Elo system) are based only on the results of games, not on intrinsic decision-making.