Selection Has A Quality Ceiling
Suppose we’re working on some delightfully Hard problem—genetically engineering a manticore, or terraforming Mars, or aligning random ML models. We need top-tier collaborators—people who are very good at a whole bunch of different things. The more things they’re good at, and the better they are at them, the better the whole project’s chances of success.
There are two main ways to end up with collaborators with outstanding skill/knowledge/talent in many things: selection or training. Selection is how most job recruitment works: test people to see if they already have (some of) the skills we’re looking for. Training instead starts with people who don’t have (all of) the skills, and installs them de novo.
Key point of this post: selection does not scale well with the level of people we’re looking for. As we increase the number of skills-we-want in our collaborators, the fraction-of-people with all those skills shrinks exponentially, so the number-we-need-to-test grows exponentially. Training has much better asymptotic behavior: as the number of skills-we-want grows, the amount of training needed to install them grows only linearly—assuming we’re able to train them at all.
Bits Of Search
Suppose I have some test or criterion, and only half the population passes it—for instance, maybe I want someone with above-median math skills. That’s one bit of search: it eliminates half the possibilities.
If I want above-median math skills and above-median writing skills, that’s (approximately) two bits, and I expect (approximately) one-in-four people to pass both tests. (Really, math and writing skills are correlated, so it will be somewhat more than one-in-four and thus somewhat less than two bits of search.) As more skills are added to the list of requirements, adding more “bits of search”, the number of people who pass all requirements will fall exponentially. With k bits of search, only 1-in-2^k people will pass, so I’ll need to search over ~2^k people just to find one potential collaborator.
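To make that arithmetic concrete, here’s a minimal sketch in Python. It assumes each requirement is an independent median split—one bit each—which is the idealized case; the specific numbers are just illustrative:

```python
# Minimal sketch of the "bits of search" arithmetic, assuming each requirement
# is an independent median split (i.e. each one eliminates half the candidates).
def candidates_needed(bits: int) -> int:
    """Expected number of people to test to find one who passes `bits` bits of search."""
    return 2 ** bits

for k in (1, 2, 5, 10, 20):
    print(f"{k:>2} bits -> test roughly {candidates_needed(k):,} people to find one match")
```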
In practice, skills are not independent, but the correlation is weak enough that exponentials still kick in. (Indeed, the only way exponentials won’t kick in is if correlation increases rapidly as we add more skills.)
I also sometimes want more-than-one bit of search in just one skill. For instance, if I want someone in the top 1⁄32 of writing skill, then that’s 5 bits of search. In practice, we usually want quite a few bits in relevant skills—for instance, if I’m looking for help genetically engineering a manticore, then I’ll want people with deep expertise in developmental biology and morphogenesis. I’d probably want something like 20 bits (i.e. a one-in-a-million person) in those skills alone, plus whatever other skills I might want (e.g. good communication, quantitative thinking, etc).
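For a rough sense of those numbers, here’s a quick sketch converting a rarity threshold into bits. It assumes the thresholds really are this clean; real skills are correlated and tests are noisy, so treat the outputs as rough figures:

```python
import math

# Sketch: bits of search needed to select someone in the top `fraction` of a skill.
def bits_for_rarity(fraction: float) -> float:
    return math.log2(1 / fraction)

print(bits_for_rarity(1 / 32))  # 5.0   -> top 1/32 of writing skill
print(bits_for_rarity(1e-6))    # ~19.9 -> roughly a one-in-a-million person
```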
Asymptotics of Selection vs Training
So, as I crank up the number of bits-of-search, the search becomes exponentially more difficult. It won’t take long before nobody in the world passes my tests—there are only ~10B people, so ~33 bits is all I get, and that’s if I test literally everyone in the world. That puts a pretty low skill cap on potential collaborators I can find! And even before I hit the everyone-in-the-world cap, exponential growth severely limits how much I can select.
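A quick sanity check of that cap, treating the world population as roughly 10 billion:

```python
import math
print(math.log2(1e10))  # ~33.2 -> testing all ~10B people buys at most ~33 bits
```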
There are ways around that: skills are not independent, and sometimes I can make do with someone who has most of the skills. But the basic picture still holds: as I raise my bar, selection becomes exponentially more difficult.
Training, in principle, does not have this problem. If I want to train two independent skills, then the time required to train both of them is the sum of time required to train each, rather than a product. So, training resource requirements should generally grow linearly, rather than exponentially. Again, skills aren’t really independent, but the basic picture should still hold even when we make the model more complicated.
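Here’s a toy comparison of how the two costs scale as the number of required skills grows. The per-skill and per-test costs are invented purely for illustration; only the shape of the curves—exponential versus linear—is the point:

```python
# Toy comparison of selection vs training costs as the number of required skills grows.
HOURS_PER_SKILL_TRAINED = 100   # assumed cost to train one skill (illustrative)
HOURS_PER_CANDIDATE_TESTED = 1  # assumed cost to test one candidate (illustrative)

for n_skills in (1, 5, 10, 20, 30):
    selection_hours = HOURS_PER_CANDIDATE_TESTED * 2 ** n_skills  # exponential in skills
    training_hours = HOURS_PER_SKILL_TRAINED * n_skills           # linear in skills
    print(f"{n_skills:>2} skills: selection ~{selection_hours:>15,} hrs, training ~{training_hours:>5,} hrs")
```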
Problem: We Don’t Know How To Train
When we look at schools or companies, they seem to mostly select. To the extent that training does take place, it’s largely accidental: people are expected to magically pick up some skills in their first weeks or months at a new job, but there isn’t much systematic effort to make that happen efficiently/reliably.
… and for most institutions, that’s good enough. The asymptotic arguments apply to finding “very high quality” people, by whatever criteria are relevant. Most institutions neither need nor find the very best (though of course lots of them claim to do so). Most people, most of the time, work on problems-we-basically-understand. They just need to be able to use known tools in known ways, in similar ways to everyone else in their field, and about-as-well as others in their field. As long as the field is large, there are plenty of typical candidates, and selection works fine.
Selection breaks down when we need people with rare skills, and especially when we need people with many independent skills—exactly the sort of people we’re likely to need for problems-we-basically-don’t-understand.
But it still seems like training ought to be great—it should be profitable for schools or companies to install new skills in people. In some specific areas, it is profitable. So why don’t we see more of this? Here’s one theory: in order to train systematically, we need some kind of feedback loop—some way to tell whether the training is working. In other words, we need a test. Similarly, we need a test to prove to others that the training worked. And if we have a test, then we could just forget about training and instead use the test to select. As long as we’re not asking for too many bits, that’s probably cheaper than figuring out a whole training program.
So, we end up with a society that’s generally not very good at training.
Summary
Most of the world mostly “gets good people” by selection: we start with a big pool of candidates and then filter for those who best fit our criteria. But this technique puts a cap on “how good” we can select for—we can’t ask for someone better than the best in the world. Even if the number of people were effectively infinite, we would still need to search over exponentially many candidates as the list of selection criteria grows.
For most institutions, this isn’t much of a problem, because they’re not “in the asymptote”—they don’t really need people with that many bits of perfection. But the Harder our problems, the more we need people with many bits—potentially people better than the current best in the world, or potentially people who are just too rare to cheaply search for in a giant pool of candidates. At that point, we have no choice but to train, rather than select.
Training is hard; it’s not a thing which most institutions know how to do well today. But if we want top-level collaborators in many skills, then we just have to figure out how to do it. Selection does not scale that way.
This post has tentatively entered my professional worldview. “Big if true.”
I’m looking at this through the lens of “how do we find/create the right people to help solve x-risk and other key urgent problems.” The track record of AI/rationalist training programs doesn’t seem that great. (i.e. they seem to typically work mostly via selection[1]).
In the past year, I’ve seen John attempt to make an actual training regimen for solving problems we don’t understand. I feel at least somewhat optimistic about his current training attempts, partly because his models make sense to me and partly based on his writeup of the results here. But I think we’re another couple years out before I really know how well it panned out.
I almost reviewed this post without re-reading it, but am glad I stopped to fully re-read. The mechanics/math of how the bits-of-selection worked were particularly helpful and I’d forgotten them. One thing they highlight: you might need a lot of different skills. And maybe some of those skills are ineffable and hard to teach. But others might be much more teachable. So maybe you need to select on one hard-to-find property, but can train a lot of other skills.
Some musings on training
I’m maybe more optimistic than John about what percentage of “school” is “training”. I think maybe 10-15% of what I learned in middle/high school was at least somewhat relevant to my long-term career, and later when I went to a trade school, I’d say closer to 50% of it was actual training, which I’d have had a harder time doing on my own. (And my trade school created half of its classes out of an attempt to be an accredited university—i.e. half the classes were definitively bullshit, and the other half were basically all useful if you were going into the domain of computer animation.)
When I say “rationality training turned out to mostly be selection”, I think probably what I mean was “it didn’t create superheroes, the way HPMOR might have vaguely led you to believe.” And perhaps, “it mostly didn’t produce great researchers.” I do think the CFAR-and-Leverage-ecosystem produced a bunch of relevant skills for navigating life, which raise the sanity-and-coordination-waterline. I think it had the positive impact of “producing pretty good citizens.” I’ve heard CFAR instructors complain that mostly they don’t seem to imbue the spark of rationality into people, they only find people who already had the spark. But, it clearly IMO created an environment where people-with-that-spark cultivated it and leveled up at it.
I’ve heard that grad school is successful at training people in the ineffable domain of research (or, the “hard-to-eff” domain of research). The thing that seems off/unsatisfactory about it, from the perspective of the x-risk landscape, is that it doesn’t really train goal-directed research, where you’re actually trying to accomplish a particular task and notice when you might be confused about how to approach it.