I am thinking of this as a noise-reducing modification to the loss function, similar to using model-based rather than model-free learning (which, if done well, rewards/punishes a policy based on the average reward/punishment it would have gotten over many steps).
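To make that analogy concrete, here is a minimal toy sketch (my own, not anything from the post): a single sampled outcome gives a high-variance credit signal, while averaging over many simulated steps recovers the true gap between options. The arm names and reward numbers are made up, and for simplicity the "model" here is just the true reward distribution.

```python
import random

# Toy setup: a "policy" is just a choice of arm with a known true mean reward plus noise.
TRUE_MEAN = {"a": 1.0, "b": 0.8}

def sampled_reward(arm: str) -> float:
    """Model-free-style signal: one noisy draw of the reward."""
    return TRUE_MEAN[arm] + random.gauss(0, 2.0)

def expected_reward(arm: str, n_rollouts: int = 1000) -> float:
    """Model-based-style signal: average the reward the policy would have gotten
    over many simulated steps (here simulated from the true distribution),
    which shrinks the variance of the credit signal."""
    return sum(sampled_reward(arm) for _ in range(n_rollouts)) / n_rollouts

# Credit assigned from a single sample is noisy; credit assigned from the
# average is close to the true difference between the arms.
print(sampled_reward("a") - sampled_reward("b"))    # high-variance comparison
print(expected_reward("a") - expected_reward("b"))  # roughly 0.2, the true gap
```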
If science were incentivized via prediction market (and assuming scientists can make sizable bets by taking out loans), then the first person to predict a thing wins most of the money related to it. In other words, prediction markets are approximately parade-leader-incentivizing.
But if there’s a race to be the first to bet, then this reward is high-variance; Newton could get priority over Leibniz by getting his ideas to the market a little faster.
You recommend dividing credit more broadly, among all the people who could have gotten the information to the market, with some kind of time-discount for when they could have done it. If we conceive of “who won the race” as introducing some noise into the credit-assignment, this is a way to de-noise things.
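Here is a sketch of that credit-splitting rule as I understand it (my own formalization, not the author's): everyone who could have gotten the result to the market gets a share, discounted by how much later they would have arrived. The discount factor and arrival times below are made up.

```python
# name -> time (in years, say) at which that person could have brought the
# result to market; earlier arrivals get more credit via an exponential
# time-discount, and shares are normalized to sum to 1.
def split_credit(arrival_times: dict[str, float], discount: float = 0.5) -> dict[str, float]:
    t0 = min(arrival_times.values())
    weights = {name: discount ** (t - t0) for name, t in arrival_times.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Newton and Leibniz arrive nearly together; a hypothetical third contender is a decade out.
print(split_credit({"Newton": 0.0, "Leibniz": 0.5, "third contender": 10.0}))
```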
This has the consequence of taking away a lot of credit from race-winners when the race was pretty big, which is the part you focus on; based on this idea, you want to be part of smaller races (ideally size 1). But, outside-view, you should have wanted this all along anyway: if you are racing for status but you are part of a big race, only a small number of people can win, so your outside-view probability of personally winning status should already be divided by the number of racers. To think you have a good chance of winning such a race, you must have personal reasons, and (since being in the race selects, in part, for people who think they can win) those reasons probably reflect overconfidence.
So for the most part your advice has no benefit for calibrated people, since being a parade-leader is hard.
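To spell that out with toy numbers (mine, purely illustrative): for a calibrated racer in a symmetric race, winner-take-all and split-credit give the same expected status; the proposal only cuts the variance.

```python
# N equally-matched racers competing for a status prize S.
N, S = 10, 100.0
mean = S / N  # expected status is S/N under either scheme

# Winner-take-all: win S with probability 1/N, otherwise get 0.
wta_expectation = (1 / N) * S
wta_variance = (1 / N) * (S - mean) ** 2 + (1 - 1 / N) * (0 - mean) ** 2

# Split-credit (idealized symmetric case): everyone gets exactly S/N.
split_expectation = S / N
split_variance = 0.0

print(wta_expectation, split_expectation)  # 10.0 10.0 -> same expectation
print(wta_variance, split_variance)        # 900.0 0.0 -> only the variance differs
```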
There are for sure cases where your metric comes apart from expected parade-leading by a lot more, though. A few years ago I heard accusations that one of the big names behind Deep Learning earned their status by visiting lots of research groups, keeping an eye out for what big things were going to happen next, and managing to publish papers on these big things just a bit ahead of everyone else. This strategy creates the appearance of being a fountain of information, when in fact the service provided is just a small speed boost to pre-existing trends. (I do not recall who exactly was being accused, and I don’t have a lot of info on the reliability of this assessment anyway; it was just a rumor.)
Here’s an example use-case where it matters a lot.
Suppose I’m trying to do a typical Progress Studies thing; I look at a bunch of historical examples of major discoveries and inventions, in hopes of finding useful patterns to emulate. For purposes of my data-gathering, I want to do the sort of credit assignment the post talks about: I want to filter out the parade-leaders, because I expect they’re mostly dominated by noise.
I do think you’re mostly-right for purposes of a researcher rationally optimizing for their own status. But that’s a quite different use-case from an external observer trying to understand what factors drive progress, or even a researcher trying to understand key factors in order to boost their own work.
A sports analogy is Moneyball.
The counterfactual impact of a researcher is analogous to the insight that professional baseball players are largely interchangeable because they are all already selected from the extreme tail of baseball-playing ability; the counterfactual impact of adding any given player to a team is likewise low.
Of course in Moneyball they used this to get good-enough talent within budget, which is not the same as the researcher case. All of fantasy sports is exactly a giant counterfactual exercise; I wonder how far we could get with ‘fantasy labs’ or something.
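If someone did want to try ‘fantasy labs’, the obvious starting point is the baseball community’s wins-above-replacement idea. Here is a deliberately rough sketch (the names, numbers, and the very notion of “discovery units” are all made up) just to show the shape of the scoring rule:

```python
# Counterfactual value of a researcher = their output minus what a
# replacement-level researcher would have produced in the same slot.
REPLACEMENT_LEVEL = 1.0  # expected "discovery units" from a typical hire

def value_above_replacement(discovery_units: float) -> float:
    return discovery_units - REPLACEMENT_LEVEL

lab_roster = {"researcher_A": 1.2, "researcher_B": 4.0, "researcher_C": 0.8}
print({name: value_above_replacement(u) for name, u in lab_roster.items()})
```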
One way to identify counterfactually excellent researchers would be to compare the magnitude of their “greatest achievement” against their secondary discoveries: the credit that parade leaders get often helps propel their future success, and the people who do more with that boost are the ones who should be given extra credit for originality (their idea) as opposed to novelty (their idea first). Newton and Leibniz both had remarkably successful and diverse achievements, which suggests that they were relatively high in counterfactual impact in most (if not all) of those fields.

Another approach would consider how many people or approaches had tried and failed to solve a problem. Crediting the zeitgeist rather than Newton and/or Leibniz specifically seems to miss a critical question: if neither of them had solved it, would it have taken an additional year, or more like 10 to 50? In their case we have a proxy for an answer. Ideas took months or years to spread at all beyond the “centers of discovery” of the time, so although the two of them clearly took only a few months or years to compete for the prize of being first (and a few decades to argue over it), we can relatively safely conjecture that whichever anonymous contender was third in the running was behind on at least that timescale. Contrast this with Andrew Wiles, whose proof of Fermat’s Last Theorem was efficiently and immediately published (and patched as needed). This matters because other, and in particular later, luminaries of the field (e.g. Mengoli, Mercator, various Bernoullis, Euler) might not have had the vocabulary necessary to make as many discoveries as quickly as they did, or to communicate those discoveries as effectively, if not for Newton and Leibniz’s timely contributions.
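One way to turn that “how far behind was the next contender?” question into a number (my own rough formalization, not something from the thread): score a discoverer by how far ahead of the next contender they plausibly were, with a half-life chosen to taste.

```python
# Credit approaches 0 when the next contender was right behind, and approaches 1
# when nobody else was close. The 10-year half-life is an arbitrary illustrative choice.
def counterfactual_credit(years_ahead_of_next_contender: float,
                          half_life_years: float = 10.0) -> float:
    return 1.0 - 0.5 ** (years_ahead_of_next_contender / half_life_years)

print(counterfactual_credit(0.5))   # near-simultaneous rival: ~0.03
print(counterfactual_credit(30.0))  # decades ahead of anyone else: ~0.88
```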