Teaching the Unteachable

Eliezer Yudkowsky Mar 3, 2009, 11:14 PM

55 points

Expertise (topic), Education, Inferential Distance, Distillation & Pedagogy

Previously in series: Unteachable Excellence
Followup to: Artificial Addition

The literary industry that I called “excellence pornography” isn’t very good at what it does. But it is failing at a very important job. When you consider the net benefit to civilization of Warren Buffett’s superstar skills, versus the less glamorous but more communicable trick of “reinvest wealth to create more wealth”—there’s hardly any comparison. You can see how much it would matter, if you could figure out how to communicate just one more skill that used to be a secret sauce. Not the pornographic promise of consuming the entire soul of a superstar. Just figuring out how to reliably teach one more thing, even if it wasn’t everything...

What makes a success hard to duplicate?

Naked statistical chance is always incommunicable. No matter what you say about your historical luck, you can’t teach someone else to have it. The arts of seizing opportunity, and exposing yourself to positive randomness, are commonly underestimated; I’ve seen people stopped in their tracks by “bad luck” that a Silicon Valley entrepreneur would drive over like a steamroller flattening speed bumps… Even so, there is still an element of genuine chance left over.

Einstein’s superstardom depended on his genetics that gave him the potential to learn his skills. If a skill relies on having that much brainpower, you can’t teach it to most people… Though if the potential is one-in-a-million, then six thousand Einsteins around the world would be an improvement. (And if we’re going to be really creative, who says genes are incommunicable? It just takes more advanced technology than a blackboard, that’s all.)

So when we factor out the genuinely unteachable, what’s left? Where can you push the border? What is it that might be possible to teach, albeit perhaps very difficult, and isn’t being taught?

I was once told that half of Nobel laureates were the students of other Nobel laureates. This source seems to assert 155 out of 503. (Interestingly, the same source says that the number of Nobel laureates with Nobel “grandparents” (teachers of teachers) is just 60.) Even after discounting for cherry-picking of students and political pull, this suggests to me that you can learn things by apprenticeship (close supervision, free-form discussion, ongoing error correction over a long period of time) that no Nobel laureate has yet succeeded in putting into any of their many books.

What is it that the students of Nobel laureates learn, but can’t put into words?

This subject holds a fascination for me, because of how it delves into the meta, the source behind, the gap between the output and the generator. We can explain Einstein’s General Relativity to students, but we can’t make them Einstein. (If you look at it from the right angle, the whole trick of human intelligence is just an incommunicable insight that humans have and can’t explain to a computer.)

The amount of wordless intelligence in our work tends to be underestimated because the words themselves are so much easier to introspect on. But when I’m paying attention, I can see how much of my searchpower takes place in fast flashes of perception that tell me what’s important, which thought to think next.

When I met my apprentice Marcello he was already better at mathematical proof than myself, certainly much faster. He’d competed at the national level—but in competitions like that you get told which problems are important. (And also in competitions, you instantly hand in the problem when you’re done, and rush on to the next one; without looking over your proof to see if you can simplify it, see it at a glance, learn something more.) But the really critical thing I was trying to teach him—testing to see if it could even be taught at all—was this sense of which AI problems led somewhere. “You can pedal as well as I can,” I said to him early on when he asked how he was doing, “but I’m still doing ninety percent of the steering.” And it was a constant, tremendous struggle to put anything into words about why I thought that we hadn’t yet found the really important insight that was lurking somewhere in a problem, and so we were going to discard Marcello’s current proof and reformulate the problem and try again from another angle, to see if this time we would really understand something.

We go through our life events, and our brain uses an opaque algorithm to grind the experiences to grist, and outputs yet another opaque neural net of circuitry: the procedural skill, the source of wordless intuitions that you know so fast you can’t see yourself knowing them. “The zeroth step”, I called it, the step in reasoning that comes before the first step and goes by so quickly that you don’t realize it’s there.

I pride myself on being good at putting things into words, at being able to introspect on the momentary flashes and see their pattern and trend, even if I can’t print out the circuitry that is their source. But when I tried to communicate my cutting edge, the borderline where I advanced my knowledge—then my words were defeated, and I was left working with Marcello on problem after problem, hoping his brain would pick up that unspoken rhythm of the steering: Turn left, turn right; this is probably worth pursuing, this is not; this seems like a valuable insight, this is just a black box around our ignorance.

I’d expected it to go like that; I’d never had the delusion that the most important parts of thought would be easy to put in words. If it were that simple we really would have had Artificial Intelligence in the 1970s.

Civilization gets by on teaching the output of the generator without teaching the generator. Einstein output his various discoveries, and then the generated knowledge is verbal enough to be passed on to students in university. When another Einstein is needed, civilization just holds its breath and hopes.

But if these wordless skills are the product of experience—then why not communicate the experiences? Or if fiction isn’t good enough, and it probably isn’t even close, then why not duplicate the experiences—put people through the same events?

(1) Superstars may not know what their critical experiences were.

(2) The critical experiences may be difficult to duplicate—for example, everyone already knows the answer to Special Relativity, and now we can’t train people by giving them the problem of Special Relativity. Just knowing that it has something to do with space and time shifting around, is already too much of a spoiler. The really important part of the problem is the one where you stare at a blank sheet of paper until drops of blood form on your forehead, trying to figure out what to think next. The skills of genius are rare, I’ve suggested, because there is not enough opportunity to practice them.

(3) There may be luck or genetic talent involved in your brain hitting on the right thing to learn: finding a solution of high quality in the space of wordless procedural skills. Even if we put you through the same experiences, there are components of true chance and genetic talent left over in having your brain learn the same wordless skill.

But I think there’s still reason to go on trying to describe the indescribable and teach the unteachable.

Consider the transition in gambling skill associated with the invention of probability theory a few centuries back. There’s still a leftover art to poker, wordless skills that poker superstars can only partially describe in words. But go back far enough, and no one would have any idea how to calculate the odds of rolling three dice and coming up with all ones. And maybe an experienced enough gambler would have a wordless intuition that some things were likelier than others, but they couldn’t have put it into words—couldn’t have told anyone else what they’d learned about the chances; except, maybe, through a long process of watching over an apprentice’s shoulder and supervising their bets.
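For comparison, that calculation is now a few lines of arithmetic. Here is a minimal sketch in Python, assuming three fair six-sided dice rolled independently; the variable names are illustrative and not from any source. It enumerates every equally likely outcome and counts the one that is all ones:

```python
from fractions import Fraction
from itertools import product

# Assumption: three fair, independent six-sided dice.
# Enumerate all 6**3 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=3))

# Keep only the outcomes where every die shows a one.
all_ones = [roll for roll in outcomes if roll == (1, 1, 1)]

# Probability = favourable outcomes / total outcomes.
p_all_ones = Fraction(len(all_ones), len(outcomes))
print(p_all_ones)  # 1/216
```

The counting step is the part that once lived only in a gambler's wordless intuition; written down, anyone can check it.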

The more we learn about a domain, and the more we systematically observe the stars at work, and the more we learn about the human mind in general, the more we can hope for new skills to make the transition from unteachable to apprenticeable to publishable.

And you can hope to trailblaze certain paths, even if you can’t set down all the path in words. Even if you yourself got somewhere through luck (including genetic luck), you can hope to diminish the role of luck on future occasions:

(A) Warning against blind alleys that delayed you, is one obvious sort of help.

(B) If you lay down a set of thoughts that are the product of wordless skills, someone reading through the set of thoughts may find their brain picking up the rhythm, making the leap to the unspoken thing behind; and this might require less luck than the events that led to your own original acquisition of those wordless skills.

(C) There are good attractors in the solution-space—clustered sub-solutions which make it easier to reach other solutions in the same attractor. Then—even if some of the thoughts can’t be put into words, and even if it took a lot of luck to wander into the attractor the first time through—describing everything that can be put into words, may be enough to anchor the attractor.

(D) Some important experiences are duplicable: for example, you can advise people what areas to study, what books to read.

(E) And finally, the simple advance of science may just describe a domain better, so that you realize what it is you know, and are suddenly able to communicate it outright.

And of course the punchline is that this is the transition I hope to see in certain aspects of human rationality—skills which have been up until now unteachable, or only passed down from master to apprentice. We’ve learned a lot about the domain in the past few decades, and I think it’s time to take another shot at systematizing it.

I aspire to diminish the role of luck and talent in producing rationalists of a higher grade.


kurige Mar 6, 2009, 5:48 PM
11 points


I was once told that half of Nobel laureates were the students of other Nobel laureates. … Even after discounting for cherry-picking of students and political pull, this suggests to me that you can learn things by apprenticeship (close supervision, free-form discussion, ongoing error correction over a long period of time) that no Nobel laureate has yet succeeded in putting into any of their many books.

What is it that the students of Nobel laureates learn, but can’t put into words?

You can’t put mentorship in a book. When I face a problem that may or may not have a solution I find it useful to convince myself that there is a solution, and that I only need to find a path to it. Once I eliminate the doubt or fear that I might be wasting time I’m able to concentrate on the problem at hand. If you define “success” as a problem that may or may not have a solution (i.e., you may or may not be able to achieve it) then studying under a superstar may give you a psychological edge over others in the same field. It’s a form of tacit permission by which you subconsciously feel entitled to success and may be more likely to take gainful risks or less likely to simply give up.
PhilGoetz Mar 5, 2009, 9:59 PM
8 points

There must also be a name-brand recognition factor, and a networking factor. If you study under a Nobel laureate, many doors will open for you. Your advisor’s recommendation is very strong. You will make many other valuable contacts. You will have the best equipment. You will not have to teach classes, because your advisor is drowning in grant money. You come from an elite university, because Nobel laureates work at elite universities. Your grant applications and job applications will always be put on the top of the inbox pile.

Also, I think the figures should be broken down into pre-WW2 and post-WW2. Pre-WW2, there were not many great research labs in most fields. It might have been unusual to study physics at an elite university in 1930 and NOT be studying under a Nobel laureate.
MichaelHoward Mar 4, 2009, 12:29 AM
5 points


What is it that the students of Nobel laureates learn, but can’t put into words?

I think an important thing is being influenced by their influences - their actual bookshelf, the people they talk to, the places they go to think, the way their whole environment is woven together.

Along with all the subconscious communicating that will be going on when you get used to working with someone, and having access to them to ask questions all the time, plus their random notes and sketches scattered around, you could get pretty in tune after a while.

How to reproduce that remotely is a very good question.
Jonathan_Graehl Mar 3, 2009, 11:58 PM
5 points

It’s not only the skills a Nobel laureate imparts to his pupils that make them future winners, but the quality of those pupils that win the competition for such an apprenticeship, and the political advantage gained by the high-prestige association, notwithstanding Nobel prizes being often awarded after most of the mentoring has already been done.

That said, 30% (or 50%) is enough to support your argument. I doubt that most Nobel winners are genius talent evaluators, and I doubt that winning a Nobel is all politics.
- Jack Mar 4, 2009, 9:31 AM
  4 points
  Parent
  
  This is my thought too. And I actually think you’re underestimating the role that selection plays. Higher level academia is actually very good at finding talent, and the talented students and the talented professors all flock to the same institutions both for independent reasons (funding) and because they prefer the company of one another. You do not do grad work under a Nobel prize winner unless everyone in your field has already noticed you and thinks it somewhat possible that you could one day win a Nobel prize. I’m actually astonished that the number of Nobel prize winners who worked under other prize winners is as LOW as Eliezer says.
  
  That doesn’t mean there isn’t some method to genius that could be taught, but I haven’t seen evidence that there is anything that can be taught.
  - Vladimir_Nesov Mar 4, 2009, 11:04 AM
    4 points
    Parent
    
    
    You do not do grad work under a Nobel prize winner unless everyone in your field has already noticed you and thinks it somewhat possible that you could one day win a Nobel prize.
    
    Read literally, this sounds completely unrealistic.
    - Jack Mar 4, 2009, 9:28 PM
      1 point
      Parent
      
      Yes. It is not literally true. Nonetheless, I’d bet students of Nobel winners almost always show significant promise. Moreover, they’re likely working in areas where Nobel prizes are likely to be won. What I mean is, there are some areas in any given field where work is likelier to yield a Nobel, even controlling for the quality of the work. In Physics, for example, Nobels are rarely awarded for the more theoretical work on less established subjects. So since the students of Nobel winners are usually in the same fields, it makes sense that they would have a higher than average likelihood of winning one as well.
nazgulnarsil Mar 4, 2009, 12:43 AM
4 points

If you have a choice between marketing something that educates people and something that allows them to be lazier, which would you choose? Millions of others have made the obvious choice, and the result is what you see.

I posit that a private rationalist school would produce people that outperform others on average by a ridiculous margin. But it would take you 20 years to prove it.
StuartBuck Mar 5, 2009, 6:23 PM
3 points

This very much reminds me of Michael Polanyi’s notion of the ubiquity of “tacit knowledge.” See his book “Personal Knowledge.”
Roko Mar 4, 2009, 5:29 PM
3 points

I’d definitely like to hear more about this, Eliezer.

One good intellectual habit that I think I have over other people, and I think you have expressed this too, is that I don’t separate different aspects of my knowledge as much as other people in academia do. To most people I have met, mathematics, physics, AI, philosophy, psychology and romance are separate magisteria. To me, they are all part of a unified whole. When I am in a social situation and something unexpected happens [for example I misunderstand someone, someone upsets me, etc.] my brain will start analyzing the event as an AI problem and an evo-psych problem. This is something I did before I found Overcoming Bias.
- Fetterkey Mar 5, 2009, 5:10 PM
  5 points
  Parent
  
  Concur. The most effective people I’ve known have combined a fair degree of intelligence and knowledge with a distinct integrative facility. Compartmentalization can at times be a useful tool for simplifying a problem, but in other cases, it can blind you to potential unconventional solutions.
Annoyance Mar 4, 2009, 6:51 PM
1 point

“The literary industry that I called ‘excellence pornography’ isn’t very good at what it does.”

No, it’s great at what it does. It’s not very good at what it represents itself as attempting.
- Vladimir_Nesov Mar 4, 2009, 7:31 PM
  7 points
  Parent
  
  This point applies universally to everything, and as a result it’s vacuous. Anything is the best at being what it actually is.
  - Annoyance Mar 5, 2009, 6:25 PM
    11 points
    Parent
    
    Yes, it’s a vacuous truth, which is why I object to its negation being offered as a reasonable statement.
    
    Let’s rephrase: excellence pornography is terrible at what it claims to do, but is excellent at what it is intended to do: get people to buy lots of it without ultimately reducing the market for itself.
    - Vladimir_Nesov Mar 6, 2009, 12:12 PM
      0 points
      Parent
      
      Take a look at what happened once more: you objected to your own misinterpretation of the original statement with its correct interpretation.
  - thomblake Mar 5, 2009, 9:34 PM
    3 points
    Parent
    
    I don’t think that’s obvious. Remember that all observations are theory-laden. A bad hammer isn’t a (really good (bad hammer)); it’s just a (bad (hammer)). Once we establish what something is actually for, it can be evaluated on its merits.
scientism Mar 4, 2009, 1:54 PM
1 point

I think this is kind of a backwards way of looking at things. All scientists go through a period of apprenticeship and very little of what it is to “do science” is written down. Textbooks contain descriptions of phenomena and experiments. There are protocols for performing common tasks. But there really isn’t an extensive literature on “how to be a scientist.” But I don’t see why we should expect it to be communicable anyway. Why should we be able to provide a casual description of what people do? I think this expectation relies on the fallacy that explicit language is a mere translation of some internal “mentalese.” Yet there’s no reason to expect that language can capture thought or even behavior on anything but a technical level (i.e., a level not immediately useful to communicating practice). And even if we could express thought and behavior appropriately there’s no reason to expect us to be adept at turning verbal descriptions back into thought and behavior. If the cognitive and behavioral sciences do manage to inform pedagogy I expect it will be in the form of providing better hands-on experiences and better apprenticeships rather than finding ways to express these ideas in textbooks.
- Vladimir_Nesov Mar 4, 2009, 7:17 PM
  1 point
  Parent
  
  There is a reason to achieve reliable communication, even if there is no reason to expect it to spontaneously emerge from the usual ways.