You have run into the “productivity paradox”: first-hand observation suggests that using computers should raise productivity, yet that rise does not seem to show up in economy-wide statistics. It is something of a mystery. The Wikipedia page on the subject has an OK introduction to the problem.
I’d suggest that the key task is not measuring the productivity of the computers. The task is measuring the change in the productivity of the researcher. For that, you must have a measure of research output. Since you can’t evaluate research output directly, you’d probably need multiple proxies. For example, one proxy might be “words of published AI articles in peer-reviewed journals.” A problem with this particular proxy is the substitution, over long time periods, of self-publication (on the web) for journal publication.
A bigger problem is the quality problem. The quality of a good today is far better than that of the similar good of 30 years ago. But by how much? There’s no way to quantify it. Economists usually rely on some sense that “this year must be really close to last year, so we’ll ignore it across small time frames.” But that does not help for long time frames, unless you are looking only at the rate of change of productivity, so that the productivity level itself drops out when you take the first derivative. That works fine as long as quality is not changing disproportionately to productivity. The problem seems much greater if you have to assess the quality of AI research. Perhaps you could construct some kind of complementary metric for each proxy you use, such as “citations in peer-reviewed journals” for each peer-reviewed article counted in the proxy noted above. And you would again have to address the effect of self-publication, this time on quality.
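To make the first-derivative point concrete, here is a minimal sketch in Python. It assumes one particular reading of the argument: that a quality-adjusted output can be formed by multiplying an output proxy by a quality proxy. The function names and all the numbers are made up for illustration and are not real data. Under that assumption, the growth rate of the raw proxy matches the growth rate of the quality-adjusted one whenever quality is flat, and the two diverge once quality starts changing.

```python
import math


def log_growth(series):
    """Period-over-period log growth rates of a strictly positive series."""
    return [math.log(later / earlier) for earlier, later in zip(series, series[1:])]


def quality_adjusted(output, quality):
    """Hypothetical adjustment: scale each period's output by its quality proxy."""
    return [o * q for o, q in zip(output, quality)]


# Illustrative, invented values (e.g. thousands of words published per researcher).
words_published = [100.0, 110.0, 121.0]

# Case 1: the quality proxy (e.g. citations per article) is flat.
flat_quality = [2.0, 2.0, 2.0]
# Case 2: the quality proxy rises over time.
rising_quality = [2.0, 2.5, 3.0]

raw_growth = log_growth(words_published)
flat_growth = log_growth(quality_adjusted(words_published, flat_quality))
rising_growth = log_growth(quality_adjusted(words_published, rising_quality))

# With flat quality, the unknown quality level cancels out of the growth rate.
same_when_flat = all(math.isclose(r, f) for r, f in zip(raw_growth, flat_growth))
# With rising quality, raw growth understates quality-adjusted growth.
understated_when_rising = all(r < g for r, g in zip(raw_growth, rising_growth))

print(same_when_flat, understated_when_rising)
```

The multiplicative form is only one possible choice. The point is that a constant, unknown quality factor cancels in the growth rate, while a changing one does not, which is why the complementary quality metric matters over long time frames.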