Comments on Power Law Distribution of Individual Impact

I had a discussion online yesterday about whether you should expect to be able to identify the individuals who will most shape the long-term future of humanity. It came up in a thread on whether CEA should have staff work on this full time, and I was expecting boring comments that just expressed a political opinion about what CEA should do. However, Jan Kulveit offered some concrete models for me to disagree with, and I had a fun exchange and appreciated the chance to make explicit some of my models in this area.

With permission of all involved, I have reproduced the exchange below.


Jan:

I would also be worried. Homophily is one of the best predictors of links in social networks, and factors like being a member of the same social group, having similar education, opinions, etc. are known to bias selection processes toward selecting similar people. This risks making the core of the movement more self-encapsulated than it already is, which is a shift in a bad direction.

I would also be worried about 80k hours shifting further toward individual coaching; there is now a bit of an overemphasis on the “individual” approach and too little on “creating systems”.

It also seems a lot of this would benefit from knowledge from the fields of “science of success”, general scientometrics, network science, etc. E.g. when I read concepts like “next Peter Singer” or a lot of thinking along the lines of “most of the value is created by just a few people”, I’m worried. While such thinking is intuitively appealing, it can be quite superficial. E.g., a toy model: imagine a landscape with gold scattered in power-law sized deposits, and prospectors walking randomly and randomly discovering deposits of gold. What you observe is that the value of gold collected by prospectors is also power-law distributed. But obviously attempts to emulate “the best” or find the “next best” would be futile. It seems an open question (worth studying) how much some specific knowledge landscape resembles this model, or how big a part of success is attributable to luck.
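
The prospector toy model can be sketched as a short simulation. This is only an illustration under assumed parameters that are not part of the original comment: every prospector has identical skill, makes the same number of finds, and deposit sizes are drawn from a Pareto distribution with a hypothetical exponent `alpha`. The function names are made up for this sketch.

```python
import random


def simulate_prospectors(n_prospectors=2000, finds_per_prospector=20,
                         alpha=1.2, seed=0):
    """Equal-skill prospectors each stumble on the same number of deposits.

    Deposit sizes are Pareto distributed (assumed exponent `alpha`).
    Returns the gold collected in the first and second half of each career.
    """
    rng = random.Random(seed)
    mid = finds_per_prospector // 2
    first_half, second_half = [], []
    for _ in range(n_prospectors):
        finds = [rng.paretovariate(alpha) for _ in range(finds_per_prospector)]
        first_half.append(sum(finds[:mid]))
        second_half.append(sum(finds[mid:]))
    return first_half, second_half


def top_share(values, fraction=0.01):
    """Share of all gold held by the top `fraction` of prospectors."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def repeat_rate(first_half, second_half, fraction=0.1):
    """Fraction of top first-half prospectors who are also top in the second half."""
    n = len(first_half)
    k = max(1, int(n * fraction))
    top_first = set(sorted(range(n), key=first_half.__getitem__, reverse=True)[:k])
    top_second = set(sorted(range(n), key=second_half.__getitem__, reverse=True)[:k])
    return len(top_first & top_second) / k


if __name__ == "__main__":
    first, second = simulate_prospectors()
    totals = [a + b for a, b in zip(first, second)]
    print("share of gold held by top 1%:", round(top_share(totals), 3))
    print("top-10% repeat rate across career halves:",
          round(repeat_rate(first, second), 3))
```

In this pure-luck version the totals are heavily concentrated in a few prospectors, yet being a top prospector early in a career should say almost nothing about being one later. Adding a per-prospector skill term would move the model toward the other end of the spectrum.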


Ben (me):

That’s a nice toy model, thanks for being so clear :-)

But it’s definitely wrong. If you look at Bostrom on AI or Einstein on Relativity or Feynman on Quantum Mechanics, you don’t see people who are roughly as competent as their peers, just being lucky in which part of the research space was divvied up and given to them. You tend to see people with rare and useful thinking processes having multiple important insights about their field in succession, getting many things right that their peers didn’t, not just one as your model would predict (if being right was random luck). Bostrom has looked into half a dozen sci-fi looking areas that others looked to figure out which were important, before concluding with xrisk and AI, and he looked into areas and asked questions that were on nobody’s radar. Feynman made breakthroughs in many different subfields, and his success looked like being very good at fundamentals like being concrete and noticing his confusion. I know less about Einstein, but as I understand it, getting to Relativity required a long chain of reasoning that was unclear to his contemporaries. “How would I design the universe if I were god” was probably not a standard tool that was handed out to many physicists to try.

You may respond “sure, these people came up with lots of good ideas that their contemporaries wouldn’t have, but this was probably due to them using the right heuristics, which you can think of as having been handed out randomly in grad school to all the different researchers, so it still is random just on the level of cognitive processes”.

To this I’d say that you’re right, looking at people’s general cognitive processes is really important, but I think I can do much better than random chance in predicting what cognitive processes will produce valuable insights. I’ll point to Superforecasting and Rationality: AI to Zombies as books with many insights into which cognitive processes are more likely to find novel and important truths than others.

In sum: I think the impact of the people who’ve had the most positive effect on history is power law distributed because of their rare and valuable cognitive processes, not just random luck, and that these processes can be learned from and can guide my search for people who (in future) will have massive impact.


Jan:

Obviously the toy model is wrong in describing reality: it’s one end of the possible spectrum, where you have complete randomness. On the other you have another toy model: results in a field neatly ordered by cognitive difficulty, and the best person at a time picks all the available fruit. My actual claims roughly are

  • reality is somewhere in between

  • it is field-dependent

  • even in fields more toward the random end, there actually would be differences like different speeds of travel among prospectors

It is quite unclear to me where on this scale the relevant fields are.

I believe your conclusion, that the power law distribution is all due to the properties of people’s cognitive processes, and not to the randomness of the field, is not supported by the scientometric data for many research fields.

Thanks for a good preemptive answer :) Yes, if you are good enough at identifying the “golden” cognitive processes. While it is clear you would be better than random chance, it is very unclear to me how good you would be. *

I think it’s worth digging into an example in detail: if you look at early Einstein, you actually see someone with unusually developed geometric thinking and the very lucky heuristic of interpreting what the equations say as the actual reality. Famously, the special relativity transformations were written first by Poincare. “All” that needed to be done was to take them seriously. General relativity is a different story, but at that point Einstein was already famous and possibly one of the few brave enough to attack the problem.

Continuing with the same example, I would be extremely doubtful that Einstein would have been picked, before he became famous, by a selection process similar to what CEA or 80k hours will probably be running. 2nd grade patent clerk? Unimpressive. Well connected? No. Unusual geometric imagination? I’m not aware of any LessWrong sequence which would lead to picking this as that important :) Lucky heuristic? Pure gold, in hindsight.

(*) In the end you can take this as an optimization problem that depends on how good your superior-cognitive-process selection ability is. Let’s have a practical example: you have 1000 applicants. If your selection ability is great enough, you should take 20 for individual support. But maybe it’s just good, and then you may get better expected utility if you are able to reach 100 potentially great people in workshops. Maybe you are much better than chance, but not really good; then maybe you should create an online course taking in 400 participants.


Ben (me):

Examples are totally worth digging into! Yeah, I actually find myself surprised and slightly confused by the situation with Einstein, and do make the active prediction that he had some strong connections in physics (e.g. at some point had a really great physics teacher who’d done some research). In general I think Ramanujan-like stories of geniuses appearing from nowhere are not the typical example of great thinkers / people who significantly change the world. If I’m right I should be able to tell such stories about the others, and in general I do think that great people tend to get networked together, and that the thinking patterns of the greatest people are noticed by other good people before they do their seminal work, cf. Bell Labs (Shannon/Feynman/Turing etc), Paypal Mafia (Thiel/Musk/Hoffman/Nosek etc), SL4 (Hanson/Bostrom/Yudkowsky/Legg etc), and maybe the Republic of Letters during the Enlightenment? But I do want to spend more time digging into some of those.

To approach from the other end, what heuristics might I use to find people who in the future will create massive amounts of value that others miss? One example heuristic that Y Combinator uses to determine who in advance is likely to find novel, deep mines of value that others have missed is whether the individuals regularly build things to fix problems in their life (e.g. Zuckerberg built lots of simple online tools to help his fellow students study while at college).

Some heuristics I use to tell whether I think people are good at figuring out what’s true, and make plans for it, include:

  • Does the person, in conversation, regularly take long silent pauses to organise their thoughts, find good analogies, analyse your argument, etc? Many people I talk to take silence as a significant cost, due to social awkwardness, and do not make the trade-off toward figuring out what’s true. I always place more trust in the people I talk to who make these small trade-offs toward truth over social cost.

  • Does the person have a history of executing long-term plans that weren’t incentivised by their local environment? Did they decide a personal-project (not, like, getting a degree) was worth putting 2 years into, and then put 2 years into it?

  • When I ask about a non-standard belief they have, can they give me a straightforward model with a few variables and simple relations, that they use to understand the topic we’re discussing? In general, how transparent are their models to themselves, and are the models generally simple and backed by lots of little pieces of concrete evidence?

  • Are they good at finding genuine insights in the thinking of people who they believe are totally wrong?

My general thought is that there isn’t actually a lot of optimisation process put into this, especially in areas that don’t have institutions built around them exactly. For example academia will probably notice you if you’re very skilled in one discipline and compete directly in it, but it’s very hard to be noticed if you’re interdisciplinary (e.g. Robin Hanson’s book sitting between neuroscience and economics) or if you’re not competing along even just one or two of the dimensions it optimises for (e.g. MIRI researchers don’t optimise for publishing basically at all, so when they make big breakthroughs in decision theory and logical induction it doesn’t get them much notice from standard academia). So even our best institutions at noticing great thinkers with genuine and valuable insights seem to fail at some of the examples that seem most important. I think there is lots of low hanging fruit I can pick up in terms of figuring out who thinks well and will be able to find and mine deep sources of value.

Edit: Removed Bostrom as an example at the end, because I can’t figure out whether his success in academia, while nonetheless going through something of a non-standard path, is evidence for or against academia’s ability to figure out whose cognitive processes are best at figuring out what’s surprising+true+useful. I have the sense that he had to push against the standard incentive gradients a lot, but I might just be wrong and Bostrom is one of academia’s success stories this generation. He doesn’t look like he just rose to the top of a well-defined field though; it looks like he kept having to pick which topics were important and then find some route to publishing on them, as opposed to the other way round.


Greg Lewis subsequently also responded to Jan’s comment:

I share your caution on the difficulty of ‘picking high impact people well’. Besides the risk of over-fitting on anecdata we happen to latch on to, the past may simply prove underpowered for forward prediction: I’m not sure any system could reliably ‘pick up’ Einstein or Ramanujan, and I wonder how much ‘thinking tools’ etc. are just epiphenomena of IQ.

That said, fairly boring metrics are fairly predictive. People who do exceptionally well at school tend to do well at university, those who excel at university have a better chance of exceptional professional success, and so on and so forth. SPARC (a program aimed at extraordinarily mathematically able youth) seems a neat example. I accept none of these supply an easy model for ‘talent scouting’ intra-EA, but they suggest one can do much better than chance.

Optimal selectivity also depends on the size of boost you give to people, even if they are imperfectly selected. It’s plausible this relationship could be convex over the ‘one-to-one mentoring to webpage’ range, and so you might have to gamble on something intensive even in expectation of you failing to identify most or nearly all of the potentially great people.

(Aside: Although it is tricky to put human ability on a cardinal scale, normal-distribution properties for things like working memory suggest cognitive ability (however cashed out) isn’t power law distributed. One explanation of how this could drive power-law distributions in some fields would be a Matthew effect: being marginally better than competing scientists lets one take the majority of the great new discoveries. This may suggest more neglected areas, or those where the crucial consideration is whether/when something is discovered, rather than who discovers it (compare a malaria vaccine to an AGI), are those where the premium to really exceptional talent is less.)


Jan’s last response to me:

For scientific publishing, I looked into the latest available paper[1], and apparently the data are best fitted by a model where the impact of a scientific paper is predicted by Q·p, where p is the “intrinsic value” of the project and Q is a parameter capturing the cognitive ability of the researcher. Notably, Q is independent of the total number of papers written by the scientist, and Q and p are also independent. Translating into the language of digging for gold, the prospectors differ in their speed and ability to extract gold from the deposits (Q). The gold in the deposits actually is randomly distributed. To extract exceptional value, you have to both have high Q and be very lucky. What is encouraging for selecting talent is that Q seems relatively stable over a career and can be usefully estimated after ~20 publications. I would guess you can predict even with less data, but the correct “formula” would be trying to disentangle the interestingness of the problems the person is working on from the interestingness of the results.

(As a side note, I was wrong in guessing this is strongly field-dependent, as the model seems stable across several disciplines, time periods, and many other parameters.)
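
A minimal sketch of the Q·p idea, under stated assumptions rather than the paper’s fitted values: each scientist has a fixed Q, each paper draws an independent luck term p, both are taken to be lognormal with made-up spreads (`sigma_q`, `sigma_p`), and the mean of log p is assumed known. Under those assumptions, log Q can be estimated as the average log impact minus the mean of log p, and the estimate should sharpen as papers accumulate.

```python
import math
import random


def simulate_careers(n_scientists=500, n_papers=20,
                     sigma_q=0.5, sigma_p=1.0, seed=0):
    """Each paper's impact is Q * p: fixed per-scientist Q, fresh luck p per paper.

    log Q and log p are independent zero-mean Gaussians (assumed spreads).
    Returns a list of (true_log_q, impacts) pairs.
    """
    rng = random.Random(seed)
    careers = []
    for _ in range(n_scientists):
        log_q = rng.gauss(0.0, sigma_q)
        impacts = [math.exp(log_q + rng.gauss(0.0, sigma_p))
                   for _ in range(n_papers)]
        careers.append((log_q, impacts))
    return careers


def estimate_log_q(impacts, mean_log_p=0.0):
    """Average log impact minus the (assumed known) mean of log p."""
    return sum(math.log(c) for c in impacts) / len(impacts) - mean_log_p


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def recovery_correlation(careers, n_papers_used):
    """Correlation between true log Q and its estimate from the first few papers."""
    true_q = [log_q for log_q, _ in careers]
    est_q = [estimate_log_q(impacts[:n_papers_used]) for _, impacts in careers]
    return pearson(true_q, est_q)


if __name__ == "__main__":
    careers = simulate_careers()
    for n in (1, 5, 20):
        print(n, "papers -> correlation with true log Q:",
              round(recovery_correlation(careers, n), 2))
```

With these made-up spreads, a single paper is a noisy signal of Q while twenty papers track it closely, which matches the qualitative claim that Q becomes estimable once there is enough of a track record.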

Interesting heuristics about people :)

I agree the problem is somewhat different in areas that are not that established/institutionalized, where you don’t have clear dimensions of competition, or where the well-measurable dimensions are not that well aligned with what is important. Looks like another understudied area.

[1] Quantifying the evolution of individual scientific impact, Sinatra et al., Science, http://www.sciencesuccess.org/uploads/1/5/5/4/15543620/science_quantifying_aaf5239_sinatra.pd