I think there is a crucial difference between performance, as defined in the paper, and ability, and it should very much be taken into account. I will not debate whether their definition of performance is consistent with common usage, but they failed to state their definitions clearly, and I think you misunderstood their results because of this.
The paper measures performance as the results of (roughly) zero-sum competitions. This is very clear when they analyze athletes (number of wins), politicians (election wins, re-elections) and actors (awards). But this is also true for research, as writing an impactful paper means arriving at a novel result before competing teams, or succeeding at explaining something where others have failed.
But, for a professional runner, winning 90% of races is not the same as being 90% faster. Indeed, a runner who is on average 5% faster will win most races (not all, as he will have off days where his speed goes down by more than 5%).
Tests such as PISA and grades try to measure ability, e.g. your math skill. That is analogous to a runner’s speed, not to how many races he wins. I believe this is very much Gaussian distributed, and the paper does not show anything to the contrary. Indeed it is very reasonable to believe that Gaussian distributed abilities result in Pareto distributed outcomes in competitive situations (it may be a provable result but I’m too lazy to do the math now). So, it’s pretty much appropriate to give grades on a Gaussian.
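The "Gaussian abilities, Pareto-ish outcomes" intuition is easy to check with a toy Monte Carlo sketch. All numbers below (runner count, ability spread, day-to-day noise) are invented for illustration; the point is only the contrast between the two distributions:

```python
import random
import statistics

random.seed(0)

N_RUNNERS, N_RACES = 100, 1000

# Abilities drawn from a Gaussian: runners are genuinely close in speed.
abilities = [random.gauss(10.0, 0.3) for _ in range(N_RUNNERS)]

wins = [0] * N_RUNNERS
for _ in range(N_RACES):
    # Day-to-day form: each runner's race speed wobbles around their ability.
    speeds = [a + random.gauss(0.0, 0.2) for a in abilities]
    wins[speeds.index(max(speeds))] += 1

# Abilities span a narrow range relative to their mean...
spread = (max(abilities) - min(abilities)) / statistics.mean(abilities)
# ...but wins concentrate heavily on a handful of runners.
top5_share = sum(sorted(wins, reverse=True)[:5]) / N_RACES
print(f"ability spread: {spread:.1%}, top-5 win share: {top5_share:.0%}")
```

Running this shows a modest spread in speeds coexisting with a heavily skewed win count: the fastest few runners take the bulk of the victories.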
Now, we could debate if productivity comes mostly from exceptional performers in the real world, which might result in similar reform ideas. BTW, that’s something I mostly don’t believe but it’s a tenable position on a very complicated issue.
I think that’s very important to note, thank you! In fact, the two measures may be quite related—it’s believable that pairwise comparisons across a normal distribution, along with some noise (most of these are small numbers of contests), can look a lot like a power law (without the asymptotic crazy-large values).
But really, the tie between education and ability or performance is pretty tenuous in the first place, so we shouldn’t take any policy recommendations from this mathematical curiosity.
Thanks for the insightful comment. I agree that the performance measures used tend toward zero-sum games. I don’t, however, think that research is an example of a (roughly) zero-sum game. Scientific breakthroughs to be made are not a limited resource in anywhere near the same sense that sports trophies are a limited resource. When we’re counting papers, we’re getting closer to zero-sum, but I still think it’s significantly positive-sum.
Leaving that aside, I still think we need more examples from positive-sum games. We could look at things like
jobs created by entrepreneurs;
wealth created by entrepreneurs;
salaries;
books sold by authors;
returns made by investors; and
records sold by artists.
My hunch is that these also follow a Paretian distribution, but I’m only about 70 percent sure of that. Hypothetically, if I were right, what would you think then?
Maybe zero-sum was not the right expression, because I think it is broader than strictly zero-sum games. I meant winner-takes-most situations, where the reward of the best performer is outsized with respect to the reward of the next-best. This does not necessarily mean that the game is strictly zero-sum. In many cases, it is just that the product you deliver is scalable, so everyone will just want the best product (of course, preferences may mean that the ranking is not the same for everyone).
I am also convinced that all the things you mentioned have a fat tail, even if they don’t strictly follow a Pareto distribution (probably books/records will be the closest to Pareto, and salaries the closest to a Gaussian but with a fat tail on the right). But I think this reflects not the distribution of quality/skill but the characteristics of the markets.
Example: book sales. I like fantasy books, but the number of books I read per year is capped. So there are a few authors I follow, plus maybe once per year I look for reviews and check if some good book by other authors has come out. If a certain book I would read is not released, chances are I would read the next best one, and find that in fact it is not much worse. Of course, books of much better/worse quality would convince me to read more/less, but in practice the quality delivered by different authors is close enough that this is a relatively small effect. If everyone had the same taste in books, and everyone read 10 books per year, we would all be reading the same 10. If an outstanding book came out, book number 10 would pass from one billion sales to zero. Of course, this is way oversimplified: we have different tastes, and the interaction of objective quality with subjective tastes, plus other factors, creates a Pareto-like distribution of sales.
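The capped-consumption story can be turned into a toy model: Gaussian book quality, each reader ranking books by quality plus personal taste noise and buying only their top ten. All parameters are invented for illustration; only the shape of the resulting sales distribution matters:

```python
import random

random.seed(1)

N_BOOKS, N_READERS, BOOKS_PER_READER = 200, 5000, 10

# Underlying quality is Gaussian: authors are genuinely close in skill.
quality = [random.gauss(0.0, 1.0) for _ in range(N_BOOKS)]

sales = [0] * N_BOOKS
for _ in range(N_READERS):
    # Each reader perceives quality through idiosyncratic taste noise,
    # then buys only their personal top ten (consumption is capped).
    perceived = sorted(range(N_BOOKS),
                       key=lambda b: quality[b] + random.gauss(0.0, 0.5),
                       reverse=True)
    for b in perceived[:BOOKS_PER_READER]:
        sales[b] += 1

sales.sort(reverse=True)
# Despite Gaussian quality, a small top slice captures most sales.
top_decile_share = sum(sales[:N_BOOKS // 10]) / sum(sales)
print(f"top-decile share of all sales: {top_decile_share:.0%}")
```

Even with quality differences this small, the cap on consumption funnels most purchases to the perceived-best few books, producing a fat-tailed sales curve from a Gaussian quality curve.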
Example 2: tech companies. In most Western countries, Google has a market share which is 10x Bing’s. It’s not that Google is 10x better than Bing. If people used Bing, they would maybe waste 10% extra time to get to the result they want. But that’s fairly consistent across different people. So Google is like a runner who is 10% faster and wins 90% of races. This is not true for all companies, but most of the largest ones rely on mechanisms which create winner-takes-most situations (IP, brand recognition, network effects, economies of scale). That’s why you have a fat tail in wealth created by entrepreneurs (IMHO).
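The "small consistent edge wins most contests" claim can be made precise under a simple Gaussian-noise assumption: if the better option holds a fixed quality edge and each comparison adds independent Gaussian noise to both sides, the win probability is Φ(edge / (√2·σ)). A minimal sketch — the 10% edge and 8% noise figures are illustrative, not measurements of any real product:

```python
import math

def win_probability(edge, noise_sd):
    """P(better option wins one noisy pairwise comparison).

    edge     -- consistent quality advantage of the better option
    noise_sd -- per-comparison Gaussian noise on each option's showing
    """
    # The difference of the two noisy showings is Gaussian with
    # mean `edge` and standard deviation sqrt(2) * noise_sd.
    z = edge / (math.sqrt(2) * noise_sd)
    # Standard normal CDF expressed via erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# A consistent 10% edge with 8% per-comparison noise wins ~81% of contests.
print(f"{win_probability(0.10, 0.08):.0%}")
```

So a modest but consistent advantage is enough to dominate repeated head-to-head choices, which is the runner analogy in one formula.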
To go back to research: scientific breakthroughs are not a limited resource, it’s true. But given the area of expertise of a researcher and the state of the art in the field, the most promising research topics are limited. And there are many researchers going into those topics. The first to find even a partial solution will easily get published on a fast track. The others will get published, but much extra work will be required: compare with previous results, fight referees who favor other approaches, show extra rigor in the analysis… All this will lower their apparent productivity. Or, if you are not confident, you can take a less promising topic: you have less risk, but your expected productivity goes down anyway. To this, add that better researchers get access to better complements: more funding, more and better collaborators, maybe fewer teaching responsibilities if you are in academia. All this widens the productivity gap between the best and the merely good. Funding is particularly perverse because it’s partially awarded on past results without dividing by the money spent to obtain them, so good/lucky researchers enter a cycle of more results → more funding → even more results → even more funding…
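The results → funding → results loop is a multiplicative process, and compounding multiplicative advantages is a classic route from a tight Gaussian start to a fat right tail (roughly lognormal). A toy sketch, with all parameters invented:

```python
import random

random.seed(2)

N_RESEARCHERS, N_ROUNDS = 10_000, 25

# Start nearly equal: initial output drawn from a tight Gaussian.
output = [max(0.1, random.gauss(1.0, 0.1)) for _ in range(N_RESEARCHERS)]

for _ in range(N_ROUNDS):
    # Each funding cycle multiplies output by a random factor:
    # good results attract funding, which scales the next round's results.
    output = [o * random.lognormvariate(0.0, 0.3) for o in output]

output.sort(reverse=True)
# After compounding, the top 1% hold a large share of total output.
top1_share = sum(output[:N_RESEARCHERS // 100]) / sum(output)
print(f"top 1% share of total output: {top1_share:.0%}")
```

The starting distribution is nearly flat; after a couple of dozen multiplicative rounds, a small elite accounts for a disproportionate share of everything produced.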
In general, I think fat tails in outcomes are present everywhere because they emerge naturally from the interaction of incentive structures (e.g. markets, IP, funding), economies of scale, and network effects. But they don’t need to reflect an underlying distribution of abilities. I obviously cannot prove that they never do, but my standard assumption is that they don’t. (You could say that I have a prior that ability is Gaussian-distributed, given that, as far as I know, all human characteristics that are directly measurable on an absolute scale look more Gaussian-like than Pareto-like.)
Thanks a ton. That is very helpful. I think I understand your point now. (Others in the comments have also said something similar, but I didn’t grasp it until now.)
Let me try to work through it in my own words and apply your insight to my question:
Education contributes to people’s abilities — at least, that’s the idea. It also certifies them. Ability is roughly Gaussian, so tests and teaching should assume that. Which they currently do.
Results, however, depend on many other (possibly overlapping) things, such as
luck;
market structure;
intellectual property rights;
economies of scale;
branding; and
network effects.
For education policy, Pareto results don’t matter. Schools can only affect the input, not the output.
I still think my reform suggestion is good. But I am no longer convinced that Pareto performance implies anything for education reform. Unless, of course, it turns out that ability does follow a Pareto distribution. But that seems unlikely to me.