Let’s start with one of those insights that are as obvious as they are easy to forget: if you want to master something, you should study the highest achievements of your field. If you want to learn writing, read great writers, etc.
But this is not what parents usually do when they think about how to educate their kids. The default for a parent is rather to imitate their peers and outsource the big decisions to bureaucracies. But what would we learn if we studied the highest achievements?
Thinking about this question, I wrote down a list of twenty names (von Neumann, Tolstoy, Curie, Pascal, etc.) selected on the highly scientific criterion “a random Swedish person can recall their name and think, Sounds like a genius to me”. That list is to me a good first approximation of what an exceptional result in the field of child-rearing looks like. I ordered a few piles of biographies, read, and took notes. Trying to be a little less biased in my sample, I asked myself if I could recall anyone exceptional who did not fit the patterns I saw in the biographies, which I could, and so I ordered a few more biographies.
This kept going for an unhealthy amount of time.
I sampled writers (Virginia Woolf, Lev Tolstoy), mathematicians (John von Neumann, Blaise Pascal, Alan Turing), philosophers (Bertrand Russell, René Descartes), and composers (Mozart, Bach), trying to get a diverse sample.
In this essay, I am going to detail a few of the patterns that have struck me after having skimmed 42 biographies. I will sort the claims so that I start with more universal patterns and end with patterns that are less common.
Exceptional people grow up in exceptional milieus
This seems to be true for >95 percent of the people I looked at.
These naked apes, the humans, are intensely social animals. They obsessively internalize values, ideas, skills, and desires from the people who surround them. It is therefore not surprising that those who grow up to be exceptional tend to have spent their formative years surrounded by adults who were exceptional.
Virginia Woolf never attended school. Her father, Leslie Stephen, who, along with their tutors, educated Virginia and her sister, was an editor, critic, and biographer “complicatedly hated” by his daughter and of such standing that he could invite Henry James, Thomas Hardy, and Alfred Lord Tennyson to dine and converse with his children. Leslie Stephen described his circle, in which Virginia grew up, as “most of the literary people of mark . . . clever young writers and barristers, chiefly of the radical persuasion . . . we used to meet on Wednesday and Sunday evenings, to smoke and drink and discuss the universe and the reform movement.” When they went to the Hebrides in the summers, Leslie brought along painters and philosophers, who would hang out and work in their summer house while the children played.
This parental obsession with curating a rich intellectual milieu comes through in nearly all of the biographies. As I wrote in First we shape our social graph; then it shapes us:
Michel Montaigne’s father employed only servants who were fluent in Latin, curating a classical culture, so Montaigne would learn Latin as his mother tongue. J.S. Mill spent his childhood at his father’s desk, helping his father write a treatise on economics, running over to Jeremy Bentham’s house to borrow books and discuss ideas.
Blaise Pascal, too, was homeschooled by his father. His father chose not to teach him math. (The father, Etienne, had a passion for mathematics that he felt was slightly unhealthy. He feared mathematics would distract Pascal from less intrinsically rewarding pursuits, such as literature, much like modern parents fear TikTok.) Pascal had to teach himself. When it was discovered that Pascal, then a young teenager, had rederived several of Euclid’s proofs, the family relocated to Paris so father and son could participate in the mathematical salons of Mersenne. The instinct was to curate a culture, not to teach, not primarily.
At least two-thirds of my sample was home-educated (most commonly until about age 12), tutored by parents or governesses and tutors. The rest of my sample had been educated in schools (most commonly Jesuit schools).
As children, they were integrated with exceptional adults—and were taken seriously by them. When Bertrand Russell, at five years old, refused to believe the earth was round, his grandparents didn’t laugh him off—they called in the vicar of the parish to reason Bertrand out of his misconception.
The adults had high expectations of the children; they assumed they had the capacity to understand complex topics, and therefore invited them into serious conversations and meaningful work, believing them capable of growing competent rapidly.
John von Neumann (the Hungarian physicist who at one time managed the development of the hydrogen bomb and the first digital computer, and as a pastime, at night, invented game theory) was included in the discussions of the management of his father’s bank before reaching school age.
From the notes of John’s younger brother Nicholas:
From the business visitors, at relatively formal dinners, and from father’s approach to them in the context of the activities of his banking house, we got introduced to the secrets of making business contacts and of management with executive powers in father’s banking house. This was always discussed, just as all school subjects, and analyzed in terms of father’s management of his activities through the means of delegating powers to his associates and staff.
Giving your children access to observe you while you work is, in my experience, rewarding but draining. While writing his ten-volume History of British India, John Stuart Mill’s father allowed John Stuart, who was three years old, to interrupt him every time he encountered a Greek word he had not seen before (he was reading the classics). His father considered raising his children to be as important as his intellectual work.
Not everyone who grew to be exceptional was this lucky. There are a few cases of people who rose to greatness despite their non-ideal circumstances—like Ramanujan and Michael Faraday. But they, too, were the fruit of exceptional milieus. They just had to summon it themselves. How did they do that?
First, they did this by reading books, by self-teaching. Second, when they grew more skilled they started reaching out to exceptional people, trying to convince them to bring them into their milieu. Ramanujan famously sent letters to a large number of English mathematicians, until one of them, G.H. Hardy, realized that this strange kid writing letters from India was not actually a crank but a raw genius and brought him over to Cambridge. (College students also lodged in Ramanujan’s house in Erode when he was a child, so he could possibly have been tutored by them, too.)
Faraday grew up in poverty in early 1800s London. He spent less than a year in school and then ended up as a bookbinder’s apprentice (the same fate as struck Benjamin Franklin). The bookbinder, George Riebau, seems to have been a decent intellectual role model, but, more importantly, he gave Faraday access to books. After having read Isaac Watts’ The Improvement of the Mind, an intellectual self-help book, Faraday started attending scientific lectures where he took copious notes. He turned Humphry Davy’s lecture series into a book, bound it, and gave it to him. Davy thought that was a nice gesture and, after first having ruined his eyes in an experiment with nitrogen trichloride, accepted Faraday as an apprentice in his lab.
Books can, in other words, be a good stand-in for a social milieu, up to a point, but eventually, you need direct access to exceptional people. And having access to them from a young age greatly increases the likelihood that you will be shaped by them.
They had time to roam about and relied heavily on self-directed learning
~95 percent.
Britain has produced a range of remarkably gifted multidisciplinary scientists and scholars who are sometimes described as polymaths. The group included, in recent times, Bertrand Russell, A. N. Whitehead, J. B. S. Haldane, J. D. Bernal, and Jacob Bronowski. Russell commented that the development of such gifted individuals required a childhood period in which there was little or no pressure for conformity, a time in which the child could develop and pursue his or her own interests no matter how unusual or bizarre.
—Carl Sagan
This freedom from peer pressure was certainly true of Russell. He was largely kept separate from other children, living secluded in his grandparents’ aristocratic mansion, something many biographers lament (just imagine how brilliant he would have been had he just had access to schools!).
In his loneliness, Russell was also kept idle. His grandmother, who was his guardian, was, Russell writes in his autobiography, “always afraid that I should overwork, and kept my hours of lessons very short.”
The “most important hours” of his days were spent alone, walking around the gardens at Pembroke Lodge which “seemed to remember the days of its former splendor, when foreign ambassadors paced its lawns, and princes admired its trim beds of flowers” but was now growing gradually more neglected, with shrubs growing over the paths and the box hedges turning into trees.
In solitude I used to wander about the garden, alternately collecting birds’ eggs and meditating on the flight of time. If I may judge by my own recollections, the important and formative impressions of childhood rise to consciousness only in fugitive moments in the midst of childish occupations, and are never mentioned to adults. I think periods of browsing during which no occupation is imposed from without are important in youth because they give time for the formation of these apparently fugitive but really vital impressions.
Russell’s childhood seems a little depressing, as does Virginia Woolf’s. (In a letter to her brother Thoby, who had been sent off to boarding school, Woolf lamented: “I have to delve from books, painfully and alone, what you get every evening sitting over your fire and smoking your pipe with Strachey etc.”)
But this immersion in boredom is also a universal in the biographies of exceptional people. A substantial fraction were completely kept apart from other children, either because their guardians decided so or because they were bedridden with various illnesses during childhood (like Descartes). A spicy hypothesis raised by this is that socializing too much with children is simply not good for your intellectual development. (I’m not going to test that hypothesis!)
A common theme in the biographies is that the area of study which would eventually give them fame came to them almost like a wild hallucination induced by overdosing on boredom. They would be overcome by an obsession arising from within.
Mozart was drilled on the piano and violin by his father, but the compositions he undertook on his own.
Pascal, as we have already mentioned, rederived several of Euclid’s proofs after self-teaching math in his spare time.
Alan Turing, who was raised in boarding schools, also seems to have self-taught a lot of mathematics (at fifteen, he derived a series for the inverse tangent function before having encountered calculus!) while being an outcast at school and facing resistance from the teachers, who thought his interests were not “well-rounded”.
Another case is Maxwell, the Scottish mathematician who unified electricity and magnetism in a series of equations of such power that the Austrian physicist Boltzmann proclaimed, War es ein Gott, der diese Zeichen schrieb? Was it a God that wrote these signs?
James Clerk Maxwell grew up in relative isolation, in Glenlair, a country house on the Middlebie estate in southwest Scotland in the 1830s. At an early age, Maxwell grew fascinated by geometry and rediscovered the regular polyhedra before receiving any formal instruction. His parents tried hiring a tutor, but Maxwell, when hit over the head by his tutor, ran out into a lake and refused to come back in until his parents fired his tutor. Instead of being tutored, he spent his first ten years reading novels with his mother, discussing farm improvements with his father, climbing trees, doing mischief, and exploring the fields and the woods and the birds and the beasts.
Let me sum up what I’ve said so far. A lot of care went into curating the environment around the children—fascinating guests were invited, libraries were built, machines were brought home and disassembled—but the children were left with a lot of time to freely explore the interests that arose within these milieus.
A qualified guess is that they spent between one and four hours daily in formal studies, and the rest on self-directed projects. Unlike children today, they had little access to entertainment, and so were often bored, unless they figured out a way to keep their minds occupied; the intellectual obsessions that grew into their life’s work often grew out of this boredom.
They were heavily tutored 1-on-1
All were likely tutored at some point; ~70 percent were tutored for more than an hour a day growing up. I’m basically making these numbers up; it is an informed guess.
When it comes to formal instruction, an important element is tutoring. Some do all of their formal learning this way (such as John Stuart Mill), others have it as a complement to schooling (such as Albert Einstein, who had a number of math-focused tutors outside of school). Erik Hoel, who has written a series of great essays about why we stopped making Einsteins (here, here, and here), singled out “aristocratic tutoring” as the most important factor. (In this term, Erik includes not only tutoring, in its classical sense, but also more casual interactions between children and competent adults.)
He writes:
Aristocratic tutoring was not focused on measurables. Historically, it usually involved a paid adult tutor, who was an expert in the field, spending significant time with a young child or teenager, instructing them but also engaging them in discussions, often in a live-in capacity, fostering both knowledge but also engagement with intellectual subjects and fields.
The importance of tutoring, in its more narrow definition as in actively instructing someone, is tied to a phenomenon known as Bloom’s 2-sigma problem, after the educational psychologist Benjamin Bloom who in the 1980s claimed to have found that tutored students
. . . performed two standard deviations better than students who learn via conventional instructional methods—that is, “the average tutored student was above 98% of the students in the control class.”
Simply put, if you tailor your instruction to a single individual, you can fit it so much better to their mind that the average person, if tutored, would rank in the top two or three in a class of a hundred. The truth is a little bit more complicated than that (and I recommend Nintil’s systematic review of the research if you want to get into the weeds), but the effect is nevertheless real and big. Tutoring is a more reliable method to impart knowledge than lectures. It is also faster.
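To check the arithmetic behind that figure, here is a minimal sketch. It assumes test scores in the control class follow a normal distribution, which is my assumption for the illustration rather than something the quote states, and the function names are made up for this example.

```python
from statistics import NormalDist


def share_of_class_below(effect_size_sd: float) -> float:
    """Share of a normally distributed control class that scores below a
    student sitting `effect_size_sd` standard deviations above its mean."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(effect_size_sd)


def expected_classmates_ahead(effect_size_sd: float, class_size: int = 100) -> float:
    """Expected number of control-class students who still score higher."""
    return (1.0 - share_of_class_below(effect_size_sd)) * class_size


if __name__ == "__main__":
    # Bloom's claimed effect size of two standard deviations.
    print(f"share below: {share_of_class_below(2.0):.3f}")            # about 0.977
    print(f"classmates ahead: {expected_classmates_ahead(2.0):.1f}")  # about 2.3
```

Under that assumption, a two-sigma shift puts the average tutored student ahead of roughly 98 percent of the control class, which matches the quoted figure. A smaller effect, say one standard deviation, would put them ahead of about 84 percent.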
When I worked as a teacher, I had students who were disruptive in a way that made them rarely learn anything during class. To make sure they didn’t fall behind, I would tutor them 1-on-1. And, though these were children with deep emotional problems, I found I could usually progress two to four times faster with them alone than I could with the class.
If you do this for 1-4 hours daily, you can go much deeper earlier, even more so if the child is uncommonly motivated and gifted. This also means more time for free exploration, self-directed learning and developing meaningful relationships.
Many of the tutors in the biographies are not particularly inspiring, however. Leo Tolstoy’s tutor, for example, seems a rather stereotypical teacher of the older stripe:
Next, when we came to our writing lesson, the tears kept falling from my eyes [because I wanted to be with my mother] and, making a mess on the paper [. . . my tutor] Karl was very angry. He ordered me to go down upon my knees, declared that it was all obstinacy and “puppet-comedy playing” (a favourite expression of his) on my part, threatened me with the ruler, and commanded me to say that I was sorry. Yet for sobbing and crying I could not get a word out.
This is from Tolstoy’s autobiography Childhood, written when he was 23, a book which is infamously fictionalized—but the portrayal of Karl has been described as accurate by people who knew Tolstoy’s real-life tutor, Friedrich Rössel.
Russell was also abused by several of his tutors and governesses. Maxwell, as I mentioned, escaped his.
But there were also tutors who were able to forge deep and meaningful connections with their pupils, where the learning became a shared intellectual pursuit.
John von Neumann’s father would get so excited about their discussions that if they were, say, talking about machine weaving, he would set out to find a Jacquard automatic loom they could study.
Marie Curie’s father built a laboratory in their apartment so they could study chemistry.
Mozart’s father was a devoted tutor to his children, with a deep love for music.
One of Virginia Woolf’s tutors, the classics scholar and women’s rights activist Janet Case, was so dear and important to Woolf that she wrote Case’s obituary nearly 40 years later.
These inspiring tutors tend to be singled out as more important than the abusive or boring ones in the autobiographies. That can be a reflection of how the authors felt about them, not what actually caused their greatness, of course. But I think this assessment is likely right. Helping another person grow rapidly requires a deep and delicate bond, in my experience. A tutor can be demanding, expecting sincere effort from you, but if the firmness does not come from a place of respect—if they do not signal that they truly believe you are capable of more than you think—harshness is degrading. I doubt the tyrannical tutors were important in shaping long-term trajectories in the cases of Tolstoy or Russell.
Cognitive apprenticeships
~90 percent did apprentice themselves at some point. ~30 percent did so before turning 14.
Every morning before breakfast, John Stuart Mill would take a walk with his father. In his Autobiography, he writes:
My father’s health required considerable and constant exercise, and he walked habitually before breakfast, generally in the green lanes towards Hornsey. In these walks I always accompanied him, and with my earliest recollections of green fields and wild flowers, is mingled that of the account I gave him daily of what I had read the day before. To the best of my remembrance, this was a voluntary rather than a prescribed exercise. I made notes on slips of paper while reading, and from these in the morning walks, I told the story to him; for the books were chiefly histories, of which I read in this manner a great number: Robertson’s histories, Hume, Gibbon; but my greatest delight, then and for long afterwards, was Watson’s Philip the Second and Third. […] In these frequent talks about the books I read, he used, as opportunity offered, to give me explanations and ideas respecting civilization, government, morality, mental cultivation, which he required me afterwards to restate to him in my own words. He also made me read, and give him a verbal account of, many books which would not have interested me sufficiently to induce me to read them of myself[.]
These conversations were a cognitive apprenticeship. Learning through apprenticeship is one of the most powerful ways of growing skilled—but if the skills are cognitive, you have to find ways to make the thoughts visible so the apprentice can imitate them.
James would model patterns of reasoning by thinking aloud and ask John Stuart to recreate his thought, imitating the thought patterns. He would give him increasingly complex tasks (books or ideas that he wanted John Stuart to summarize and articulate), then he would scaffold John Stuart by asking questions that helped him solve the task, and he would coach and give feedback on how to improve.
(James only seems to have been able to do this on walks, however. When he tried to instruct his son in the study, he would, perhaps because of the more formal setting, use less effective pedagogies—hammering John Stuart in the head with instructions, failing to give examples or demonstrate the skills he was trying to impart—resulting in a lot of pain and frustration.)
On the walks, James would refrain from giving lectures until John Stuart had himself struggled with the problems and gotten a visceral feel for their difficulty:
Striving, even in an exaggerated degree, to call forth the activity of my faculties, by making me find out everything for myself, he gave his explanations not before, but after, I had felt the full force of the difficulties[.]
At first, these tasks were made up: summaries of stories and the like. But already in his early teens, John Stuart was doing real intellectual work on the walks.
His first major contribution came at thirteen when James, who had recently finished his History of British India, decided to write a didactic treatise on Ricardo’s work on political economy.
In writing this work, James Mill leveraged the apprenticeship he had fostered with his son. He began thinking aloud about this new field, political economy, expounding “each day a part of the subject”, and asked John Stuart to give him a written summary the next day. John Stuart was pretty good at this by now, but this being a work of an altogether new seriousness, it was hard work. They would spend the walks dissecting John Stuart’s summaries, “which he made me rewrite over and over again until it was clear, precise, and tolerably complete”. That is: John Stuart externalized his thought, and his father corrected the thoughts and gave feedback until John Stuart’s understanding of political economy converged with his. He also sent John Stuart on walks with Ricardo himself.
When they were done, James Mill took his son’s notes and polished them into the book Elements of Political Economy. It was published the year John Stuart turned fifteen.
This type of intellectual apprenticeship is a recurring pattern in the biographies. At some point in their teenage years—and sometimes earlier—the future geniuses would apprentice themselves intellectually to someone with exceptional capacity in their field.
Russell was discovered by Whitehead, one of the world’s foremost philosophers and mathematicians, and collaborated with him all through his twenties; Pascal worked with his father; Faraday became Davy’s assistant; Euler was taken on by various members of the Bernoulli family, all extraordinary mathematicians.
At this point, they were not only learning, but also doing real intellectual work.
They were gifted children
An important factor to acknowledge is that these children did not only receive an exceptional education; they were also exceptionally gifted.
Erik Hoel, in his essays about the education of genius, indicates that tutoring matters a lot more than raw intelligence and other genetic factors. I think this claim is too strong:
Erik’s tweet sounds more extreme than it is (MIT is a selective institution); still, the outcome he predicts is highly unlikely given the observations we have.
Like most of the people sampled in this essay, John von Neumann was fiendishly gifted. He could divide eight-digit numbers in his head at six; I’m a pretty dedicated tutor to my five-year-old, and I can see no path to that type of excellence within the next twelve months.
When von Neumann entered university, George Pólya, another famous mathematician, recounts:
There was a seminar for advanced students in Zürich that I was teaching and von Neumann was in the class. I came to a certain theorem, and I said it is not proved and it may be difficult. Von Neumann didn’t say anything but after five minutes he raised his hand. When I called on him he went to the blackboard and proceeded to write down the proof. After that I was afraid of von Neumann.
If we were to clone von Neumann and for some reason distribute the clones in a random selection of American homes, few if any of them would have the quality of education the original von Neumann had. A few of them might be broken down by toxic family conditions. But the other 950 or so—if they decide to attend MIT at the same time—would probably be quite a sight. Maybe not “I’ll invent the computer, game theory, and the hydrogen bomb at the same time” levels of genius; but also not the average MIT class. And who knows, having 950 von Neumanns at the same campus might also supercharge them into world-destroying feats of genius.
The innate talent of those who grow up to be exceptional is particularly clear when it comes to those who excelled in mathematics like this. But we can see the same thing in other fields. Richard Wagner was instructed on the piano by his Latin teacher but dropped out since he was unable to understand scales. Instead, Wagner learned by transcribing theatre music by ear. Once he had reached the end of his natural abilities, he sought out a composer, Christian Gottlieb Müller, and convinced his mother to allow Müller to teach him composition. Wagner was thirteen at the time. Two years later, he was able to transcribe Beethoven’s 9th symphony for piano.
I have known quite a few talented musicians, and that just never happens.
This is not to say that the peculiarities of their education were not important and (in whatever regard it fits the lives of you and your child) worth emulating. Access to exceptional role models, and dedicated, personalized education is transformational. In some cases, as with John Stuart Mill, it is possible that most of his exceptional skill can be attributed to the education, rather than innate talent.
If you want to, you can do this, too
Doing all of this—curating an exceptional milieu, providing dedicated tutoring and opportunities for apprenticeship—is hard work. You could pull it off if you put your mind to it, I trust. Though, like everything pursued to excellence, it would demand serious sacrifices. Particularly of time. It is ok not to want that.
A lot of it does not require sacrifices, though. It is just a way of viewing children: as capable of competence, as craving meaningful work, as worthy to be included in serious discussions. We can learn to view them like that, but it is a subtle and profound shift in perception, a shift away from the way we are taught to view children. When I read the biographies, it feels a little bit like getting new peers. Their way of being works on me. Gradually, I raise my aspirations.
There is a moving scene in John Stuart Mill’s biography, when John Stuart is about to set out into the world and his father for the first time lets him know that his education had been . . . a bit particular. He would discover that others his age did not know as much as he did. But, his father said, he mustn’t feel proud about that. He’d just been lucky.
Let’s make more people lucky.
I agree that children are capable of understanding complex topics, and we should take children far more seriously.
When my kids were young, I exposed them to a wide range of advanced concepts in fields like physics and philosophy. I never “pushed” my kids. Rather, the kids asked questions (usually during car rides and at bed time), and I answered them honestly. For example, when my seven-year-old asked if time travel was possible, I introduced him to the topic of special relativity. Honestly, he picked up the ideas more quickly than most adults would. I think the reason for his aptitude wasn’t that he was a “genius,” but rather that he still had enough imagination to accept a notion like time dilation. In other words, it is often easier for children to understand “big ideas” because their minds are not yet rigid and closed-off.
A subset of other moms, and even some teachers, would say things like
“You taught your son about what???” or “A child can’t possibly understand that!” or “No child is ready for that book.”
These adults were highly critical of exposing kids to advanced concepts “too early.” By contrast, my thoughts were that there is no such thing as “too early.” Yet the underestimation of children seems endemic. For instance, I remember one year at the elementary school science fair, I heard the judges discussing my son’s project as I walked to the restroom. They concluded that “It was amazing, but no child could have done that.” However, the truth was that my husband and I didn’t even understand our son’s project until he slowly explained it to us the night of the science fair. (And we couldn’t have helped with something we hadn’t even understood.)
Fortunately though, society is not doomed to forever underestimate its youngest members. I remember from my own childhood that most people thought it was ill-advised to expose children to a foreign language before high school. (The idea was that foreign languages were far too advanced for children to pick up, and children would be confused from early exposure.) Yet eventually people realized that young children are actually better at learning foreign languages than older children, let alone adults. Furthermore, if children don’t learn a word of French on their first day in French immersion school, nobody says
“I told you they were too young. That was completely over their heads. You shouldn’t push them to understand anything so difficult. Those poor kids.”
Instead, people say things like “Just having exposure to something they cannot yet understand is good. It will open their minds and prepare the way for tomorrow.”
My hope is that we will eventually take that lesson and apply it to topics outside of foreign languages.
As to my own sons, who are now young adults, early exposure to big ideas didn’t hurt them. They are both curious, philosophically-oriented, self-directed learners, who truly care about the present state and future of humanity. In fact, I wouldn’t even be here if my oldest son—who graduated from Princeton last year with a major in computer science and as many elective philosophy courses as he could fit into his schedule—hadn’t told me about the Less Wrong website in the first place.
So thank you to Karlsson for the research and insights. My stories might be anecdotal, but I think they reflect broader issues—such as the way most of society underestimates the intellect of children—and how we can independently overcome these issues, while also advocating for societal change. Our children can be exceptional if we stop holding them back.
Thanks for sharing this! That’s a beautiful anecdote. When I worked as a teacher, I would let the 6-year-olds give me questions and we’d investigate them together; we covered some pretty advanced topics: evolutionary theory, the basics of Newtonian mechanics, electricity, the atomic theory etc. The kids and parents loved it but I ended up on a collision course with some of the other teachers.
Also, I’ve taught my five year old a second language through immersion, which feels like a free lunch. Just show films in the other language, and speak it at home every other day, then get some friends in the language, and voila, you never have to struggle with that. She now does this on her own, trying to learn English this way by restructuring her environment.
It’s so sad that other teachers weren’t on board with the advanced topics. Some adults can’t stand it when you teach kids about topics that they don’t understand themselves. I think it’s because doing so makes the adults feel less superior to kids. Just know that you were doing the right thing (if the right thing means helping kids to love learning, and to not be afraid of any topic). And what a gift for your daughter with a second language! She is so fortunate.
What type of program did you use for your five year old for language acquisition? I want to start a similar course of language with my three year old. Were you programmatic or was it just immersion? Curious if you have any resources. For context, the language I want to teach is Spanish. It’s a second language for me and I’m reasonably fluent after a dozen years of school and several months of immersion in South America. I’m a few years out of practice, however… Thanks!
Just immersion. I did some Duolingo for myself so I would be able to speak some to her, but the rest was just letting her see films in the language like 2-3 hrs a week for two years. Then we found her friends who spoke the language—let her play with them for like 100 hrs. Now she’s pretty fluent, at the level of a native kid a year younger than her or so.
Hmmmm.
I find this super interesting, but as always I worry about selection effects.
There are many famous, successful and influential people in history. My question would be what % of those people had tutoring, cognitive apprenticeships etc...
This post chooses a number of famous people. Presumably the selection process goes something like this:
look at a list of famous people
look which ones have something written about their education
write about those ones
The problem is that those with unusual educations are more likely to have written about them. What if there are many famous/successful people who mostly had a normal education?
Your question seems like it should be the other way around. We don’t really care about P(tutoring | success) directly when choosing actions, but we do care about P(success | tutoring) versus P(success | not tutoring).
Unless the relevant base rates P(tutoring) and P(success) are known, P(tutoring | success) by itself tells us little.
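To make the base-rate point concrete, here is a minimal sketch of the Bayes' rule arithmetic. All numbers are hypothetical: 0.7 loosely echoes the share mentioned for the biography sample, while the base rate of success and the two values for P(tutoring) are assumptions picked only for illustration, and `success_rates` is a made-up helper name.

```python
def success_rates(p_tutoring_given_success, p_success, p_tutoring):
    """Return (P(success | tutoring), P(success | no tutoring)) via Bayes' rule."""
    p_s_given_t = p_tutoring_given_success * p_success / p_tutoring
    p_s_given_not_t = (1 - p_tutoring_given_success) * p_success / (1 - p_tutoring)
    return p_s_given_t, p_s_given_not_t


# Hypothetical inputs, for illustration only.
P_TUTORING_GIVEN_SUCCESS = 0.7  # share of the exceptional who were tutored
P_SUCCESS = 1e-6                # assumed base rate of being "exceptional"

for p_tutoring in (0.01, 0.7):  # assumed share of all children who were tutored
    with_t, without_t = success_rates(P_TUTORING_GIVEN_SUCCESS, P_SUCCESS, p_tutoring)
    print(f"P(tutoring) = {p_tutoring}: tutored children are "
          f"{with_t / without_t:.1f}x as likely to succeed")
```

With the same P(tutoring | success), the ratio is large if tutoring was rare in the relevant population and about 1 if 70 percent of everyone in it was tutored, which is why the base rates matter.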
So what I have done is altogether too rough to answer this question. But from my sample (which is basically me writing down about 30 names I can think of as exceptional and then looking at their bio), tutoring seems to have played an important part for at least 70 percent. By which I mean, they got at least an hour a day of formal tutoring from someone skilled at it. I think that is more than average.
Tutoring is not as universal as just having really smart people around to talk to, though. That is nearly universal in my sample, and is surely less common among unsuccessful people.
I don’t find your methodology for deciding when tutoring has played an important part persuasive.
In fact, even if we could show that P(success | tutoring) > P(success | not tutoring), that again by itself would tell us little because it would only be correlational evidence. Judging whether tutoring played an important part in the success of these people needs to be done using a more rigorous causal analysis, which means controlling for obvious confounders such as family wealth and genetic endowment in some form even if the study has to be observational in nature. This is impossible to do simply by reading Wikipedia articles about people who have been successful.
Again, that being less common among unsuccessful people doesn’t tell us anything of value, because
it’s only correlational evidence, not causal; and
it’s only directional evidence and doesn’t give us much information about the magnitude of the effect.
The interesting question here is about the effect size—on priors I think it’s easy to agree that having smart people to talk with during childhood would have a positive impact on your future success as an adult. However, is this a d = 0.01 effect, a d = 0.1 effect or a d = 1 effect? What’s the order of magnitude?
I would expect the effect size of childhood tutoring to be small to moderate if we could actually run this experiment or at least get good enough observational data to control for the obvious confounders, and I don’t think this position is really contradicted by the information presented in your post. As a consequence, I remain unconvinced by your central thesis.
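To give a feel for why the order of magnitude of d matters for outlier outcomes, here is a rough sketch. It assumes the outcome is normally distributed with unit variance in both groups and that "exceptional" means clearing a fixed threshold (4 SD here); both are simplifying assumptions made for this example, not claims from the post, and `tail_ratio` is a made-up helper name.

```python
from statistics import NormalDist


def tail_ratio(d, threshold_sd):
    """How many times more likely a group shifted by d SD is to exceed the threshold."""
    baseline = 1 - NormalDist(0, 1).cdf(threshold_sd)
    shifted = 1 - NormalDist(d, 1).cdf(threshold_sd)
    return shifted / baseline


for d in (0.01, 0.1, 1.0):
    print(f"d = {d}: {tail_ratio(d, threshold_sd=4):.2f}x as likely to clear +4 SD")
```

Under these assumptions a d = 0.01 shift barely moves the tail, while a d = 1 shift multiplies it many times over, so the three candidate effect sizes imply very different things for parents.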
I like your rigor. I feel too time-constrained to be this systematic when I think about how to raise my kids. I would love to know how you would approach that decision: what data you would look at. And if you have kids, or know how you would raise them, I would love to know how you approach it, too. Especially the parts that contradict the patterns I noted in the sample in my essay.
There are selection effects, for sure. The process wasn’t as bad as you describe, but it was pretty bad, as I describe in the post. I made the list of names (before looking up what they had written etc). I also actively looked for counterexamples to add to the list later. So the figure of two-thirds homeschooled, for example, is just the number I got going through everyone. About a third did go to schools, Jesuit schools being most common in my sample. The post itself uses a lot of colorful examples, because that’s pretty much what I’m doing: getting an impression.
JVN did not invent the computer or the hydrogen bomb. He contributed to the design of both, and both were group projects (there were many early computers!), but Teller is more responsible for the bomb, and Turing and many others are more responsible for digital computers. Where does this hero worship of JVN on this site come from? This tendency to assign all credit to him? Is it because he was a mentat? His mentat abilities probably overinflated his reputation in the pre-computational era but would be much less useful or impressive today (and were clearly not an outsized advantage even then).
It comes from the people who worked with him. Even great minds like Teller, who you mentioned, held him in awe:
https://en.wikipedia.org/wiki/John_von_Neumann
To my knowledge, no one else in history has had such a large impact over so many fields (mathematics, physics, computer science, engineering, statistics, game theory, economics). If he had been an economist and nothing more he would still be famous.
https://en.wikipedia.org/wiki/John_von_Neumann
I would also not want to test it. But there’s a middle ground that has had more testing: socializing with kids older than you.
I attended a democratic school that had children from 4yo up to 18yo, and we were all in the same environment, free to interact. That meant that there was always someone older you can look up to and learn from. And indeed, it seems to me that kids in democratic schools are much more mature.
I should mention that Montessori education also groups together several ages, though usually not this many.
This seems to be pretty helpful for thinking in general. I’ve noticed that some of the most valuable things I noticed came not from four one-hour periods of thinking with long breaks, but from a single uninterrupted 4-hour period (usually forced upon me in some way). Even with media/social media taken out of the equation, reading LW or books still counted as an interruption.
You could, but you haven’t argued that this is not a very bad idea. Almost everything you identified leads to high variance, but it doesn’t necessarily usually have positive impact. I’m not sure I’d want my kids to be ten times as likely to be exceptional at the cost of them being one hundred times as likely to be miserable—and many of the things that get recommended here are also common among dilettantes and losers, who are far more common.
Edit to add: I’m not saying it is a bad idea, but I am saying that the presentation of the idea, and the motive to promote these ideas, is fundamental to the motive and method. I think this flaw should lead us to mostly discard the filtered evidence presented.
There are a bunch of things in the post I would never do. But I doubt highly that most of the things are of a sort that is likely to lead many to be miserable. The two who are the most miserable in the sample are Russell and Woolf who were very constrained by their guardians; Mill also seems to have taken some toll by being pushed too hard. But apart from that? Curious: what do you find most high-risk apart from that?
Mind the potentially strong selection bias specifically here, though. Even if in our sample of ‘extra-successful’ people there were few (or zero) who were too adversely affected, this does not specifically invalidate a possible suspicion that the base rate of creating bad outcomes from the treatment is very high—if the latter have a small chance of ever getting to fame.
(This does not mean I disagree with your conclusions in general in any way; nice post!)
I’m positing that there is a set of people for whom the various preconditions you’ve identified for being an exceptional person hold, and you’ve then post-hoc selected the ones who were exceptional. I wondered if it might be the case that a majority of that set, but only a minority of the chosen subset, are miserable. And the reason I think that is that some people do poorly with only self-direction.
Yes, I encourage everyone to avoid the nitpick trap. There’s plenty of good things to take from this essay. You don’t need to hire abusive tutors.
That completely misunderstands the objection—see my other response.
What specifically do you think is really high variance as opposed to the main downside being that it is expensive? If it is the ‘not going to school thing’ at least when I was growing up as a religiously homeschooled kid in the 90s, the strong impression that I got was that homeschooled kids systematically did better than other kids in terms of college success and other legible metrics—of course this has a gargantuan selection bias going on. But that does give a strong lower bound for how bad that specifically can be for kids.
The other stuff I recall from the article (ie being from a high resource background, having an intellectual mentor, being surrounded by intellectual conversations, getting one on one tutoring, good intrinsic capability) all seem to be things that either you can’t pick whether a child has or not, or where it would be weird if they left the child worse off.
One on one tutoring, for example, just doesn’t seem like a high variance thing, it seems like a positive expected value thing that might not actually be that causally important or have that big of impact, but where it will only make things worse in exceptional cases.
Isolating kids from peers is damaging to social skills in many cases. That would not show up in academic success, but it matters for happiness.
Giving kids control over what they learn, and having them self-guide, is very prone to failing to pick up key skills—and some of the time, the skills are critical enough to handicap them later.
Also, “that does give a strong lower bound for how bad that specifically can be for kids”—It really doesn’t. If 25% of homeschooled kids do much better than average, and 75% do significantly worse, looking at those who went to college means you’ve completely eliminated the part of the sample that was harmed.
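A toy simulation of that scenario, as a sketch only: the 25/75 split comes from the comment above, while the score scale, the group means, and the college cutoff are made-up assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulate_child():
    """Outcome score in arbitrary units; 0 stands for the general-population average."""
    if rng.random() < 0.25:
        return rng.gauss(1.0, 0.5)   # the 25% who do much better than average
    return rng.gauss(-0.5, 0.5)      # the 75% who do significantly worse


COLLEGE_CUTOFF = 0.5  # assumed: only children above this score reach college

outcomes = [simulate_child() for _ in range(100_000)]
college = [score for score in outcomes if score > COLLEGE_CUTOFF]

print(f"all homeschooled:   mean = {mean(outcomes):+.2f}")
print(f"college-goers only: mean = {mean(college):+.2f}")
print(f"share reaching college: {len(college) / len(outcomes):.0%}")
```

In this made-up world the average homeschooled child does worse than the general population, yet the ones visible in college statistics look well above average, because the filter removed the harmed part of the sample.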
So this is based on my memory of homeschooling propaganda articles that I saw as a kid. But I’m pretty sure the data they had there showed most kids went to college. In my family three of us got University of California degrees, and the one who only got a nursing degree in his thirties authentically enjoyed manual labor jobs until he decided he also wanted more money.
Perhaps these numbers do stop at college, and so we don’t see in them children who get a good college education, but then fail in some important way later on in life, but I’ve never gotten an impression from anywhere that homeschooled children have generally worse life outcomes—anyways, this is something that the data has to actually exist for since several percent of US children have been homeschooled for the last several decades.
I did have substantial social problems, even as an adult, and they have led me to be less successful in career terms than I probably would have been with stronger social skills. But this might be driven by a selection effect: The reason my parents actually started homeschooling me was because I was being bullied and having severe social problems in third grade.
“this is something that the data has to actually exist for since several percent of US children have been homeschooled for the last several decades.”
Never mind. There aren’t particularly good studies. But what exists seems to say that homeschooled students do much better than average for all students, but maybe somewhat worse than the average for students with their parent’s SES backgrounds.
But the data mostly comes from non-random samples, so it is hard to generate firm conclusions.
It’s also mostly “conditional on acceptance, homeschooled students do better”—and given the selection bias in the conditional sample, that would reflect a bias against them in admissions, rather than being a fact about homeschooling.
Have you looked at Cox: https://gwern.net/doc/iq/high/1926-cox-theearlymentaltraitsof300geniuses.pdf ?
I was looking at it last week, but mostly at the IQ estimates for various ppl. Is it worth going deeper on? Does it have discussions of patterns in their environments?
I mean, that’s pretty much what Cox is doing starting pg165*, or you could skip to pg216, and the case studies would surely provide a lot of examples for you. I’d also suggest Anne Roe because your samples won’t overlap with hers and she was very interested in any childhood antecedents of world-class researchers.
* Worth remembering that the ‘genetic’ in Genetic Studies of Genius doesn’t mean ‘genes’ but ‘genesis’, as in, ‘origin’, both environmental and genetic. (Indeed, from a behavioral genetics point of view, Cox & the Terman Study are largely useless.)
I’m not seeing that much here to rule out an alternative summary: get born into a rich, well-connected family.
Now, I’m not a historian, but iirc private tutoring was very common in the gentry/aristocracy 200 years back. So most of the UK examples might not say that much other than this person was default educated for class+era.
Virginia Woolf is an interesting case in point, as she herself wrote, “A woman must have money and a room of her own if she is to write fiction.”
Great work; it seems interesting, though, as others mentioned already, the methodology appears rough. I want to add a low-confidence historical take on this.
TLDR: This sounds like learning in the ancestral environment; industrial civilization’s education is shaped by statist interests.
Society is using a version of the Prussian school system to this day. Said system was built to generate obedient soldiers and productive workers. Those systems domesticate humans into an industrial society. Cultures with little formal schooling often have difficulties functioning in industrial settings like factories. The very structure of school optimizes those two goals. You are to follow a superior authority figure, which historically was allowed to use excessive power and was considered right by the system. You are to do precisely what you are told, in a manner that disturbs no one, for a set time, and if you get breaks, you come back in as soon as the factory bell rings.
In contrast, learning in a hunter-gatherer society works a lot like you describe the way those exceptional people were educated. They are shaped by the social and physical environments, can roam and play a lot, and have free access to adults doing interesting things (I’m simplifying a lot of cultural limits and social conventions away, but I think in a society that was still developing those, this would be the situation), and were involved in those activities from a very early age on (a hurray for child labor).
I think that most of this learning process was and is directed toward figuring out how to function in a band of monkeys. Other monkeys are the most complex, dangerous, and valuable thing in the ancestral environment. Social isolation or making the vital qualities in those environments intellectual instead of having them be the usual status games adapts those old software modules.
I’m less sure about my model for farming societies, but you already described the ways of aristocratic education, and I think that a farming village isn’t too socially different from a tribe on this front.
Curated. While mining biographies for patterns feels more dicey than a well-constructed study, I think there’s real evidence here worth paying attention to. At the least, the patterns identified here seem worth promoting as hypotheses for ingredients of greatness. One thing that strikes me is how different these childhoods sound than the conventional ones in the modern world that consist of many hours of large-classroom schooling. I am pretty certain had I been tutored 1-1 for years, I’d have ended up knowing vastly more than I do, and ended up vastly more capable.
I anticipate having my own child soon, and wonder how many elements here I’ll be able to offer them.
I never get tired of mentioning that Oliver Heaviside is apparently the self-taught genius who created new mathematical objects to make the 20 equations of Maxwell into the 4 we know today. No idea about his childhood, though, but you might find it interesting to read a bit about him.
Have you done a differential comparison for the social class and timeframe of each of those childhoods? That is, was their childhood environment as unique as the outcome, or was it pretty standard and there were thousands of non-exceptional adults that had similar (on these dimensions) childhood environments?
Like the advice “being very good at a sport requires dedication and training”, this analysis may miss the point that SOMETHING is enabling and rewarding that level of time and dedication (from the parents and mentors). We’re not very close to identifying and replicating that causal thing.
No.
There is the anecdotal evidence that several of them are described by themselves or contemporaries as eccentric in their upbringing. There is also a strong tendency for siblings to be fairly exceptional as well (likely largely genetic). Most of the sample is from a time period which, according to some ways of measuring it, produced more genius per capita than today, so even if they were a bit typical for their class and time (which I think they were sort of not, not in the details), it still seems the mode of production had a higher rate of producing outlier results than the contemporary standard. But I’m very unsure about all of this!
Modern geniuses could, on average, be more secretive because advancements beyond von Neumann’s are immensely info-hazardous.
So the rate per capita might not have changed much.
Or von Neumann and his contemporaries and predecessors stole all the insights that someone with merely von Neumann’s intellect could develop independently, leaving future geniuses to have to be part of collaborative teams?
What does this mean?
oops, that was supposed to be something like ‘low hanging fruit’, I’m pretty sure it was a typo.
I still don’t understand how it’s possible for Von Neumann et al to ‘steal’ knowledge or insights. Steal from what?
Plus most of their discoveries appear to be non-rivalrous, unlike low hanging fruit.
What I meant is that it is possible the things that von Neumann discovered were easier to discover than anything that is still undiscovered, so new von Neumanns won’t be as impressive.
Why wouldn’t they be?
Sure, the median ‘impressiveness’ of various discoveries might change over time, but whether someone’s discoveries were 5 standard deviations above average in 1953 or 5 standard deviations above average in 2023 doesn’t seem to matter?
So we can’t have fewer geniuses. More people means more people above 5 standard deviations (by definition?).
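A back-of-the-envelope version of that claim, as a sketch: it assumes the relevant trait is normally distributed relative to each era's own mean, and it uses rough world population figures for 1953 and 2023 that are my own approximations. `expected_count_above` is a made-up helper name.

```python
from statistics import NormalDist


def expected_count_above(population, threshold_sd):
    """Expected number of people above the threshold under a standard normal."""
    tail = 1 - NormalDist().cdf(threshold_sd)
    return population * tail


# Rough world population figures (assumptions, not exact counts).
for year, population in ((1953, 2.7e9), (2023, 8.0e9)):
    count = expected_count_above(population, threshold_sd=5)
    print(f"{year}: about {count:,.0f} people above +5 SD")
```

Under a fixed relative threshold the count scales linearly with population. Whether a relative threshold is the right notion of "genius" is the point under dispute in this thread.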
Even if we assume this, it does not follow that we should try to recreate the subjective conditions that led to (perceived) “success”. The environment is always changing (tech, knowledge base, tools), so many learnings will not apply. Moreover, biographies tend to create a narrative after the fact, emphasizing the message the writer wants to convey.
I prefer the strategy to master the basics from previous works and then figure out yourself how to innovate and improve the state of the art.
Out of your 42 people, what were their class backgrounds, and what were the decades in which they grew up?
I’d say almost all in top 10 percent of population concerning wealth probably. Most of the sample is 1800s. It is not a very systematic sample.
I recently looked through the wikipedia list of the thirty richest Americans, and then tried to dig back into their class background (or the class background of the founder of the family fortune for heirs, like the Walton family). In almost every single case where I could identify the class background, they were from a top couple of percent background, but in only a few cases were they from an old money background. So a lot of the founders of big fortunes have backgrounds like ‘father was a lawyer/ stockbroker/ ran a grocery store/ dentist/ college professor/ middle manager’.
One interesting feature here was that there were several Russian immigrants or children of immigrants on the list (usually they moved to the US before they were a teenager, and usually they were Jewish). In these cases I found that I have generally no idea what class status is implied by the descriptions of their parent’s work in the Soviet Union. But I sort of suspect it usually was still top couple of percent.
I then looked at the European numbers, which were an interesting contrast in that:
A) A lot of the European super fortunes start with people who are as rich as far back as wikipedia tracks it. Ie the founder of the company got his money from his rich textile factory father (who doesn’t have a wikipedia account) in the late nineteenth century.
B) Weirdly, there were also more actual rags to riches stories among the European superrich. The Zara founder is the one that stuck in my head. He seems to have been from a household definitely in the lower two thirds of the income distribution, and possibly even from a genuinely poor family in early Franco era Spain. There were several other stories that felt very much like ‘person with a totally normal family background somehow builds a giant fortune’, while again that seemed to not happen in the US listings.
I probably should make a post based on this at some point.
Re: Europe. This fits with my understanding of the wealth elite in Sweden. Sweden, surprisingly, has a very high wealth concentration, with a few dynasties controlling a large part of the banking and industry sector. However, most wildly successful individual companies (HM, IKEA, Ericsson, etc) were started by people from the middle or lower classes. The HM founder’s father owned a store in a small Swedish town. The founders of IKEA and Ericsson both grew up poor. Ericsson worked building railways starting at age 12.
Nitpick: Childhood is not an autobiography, it’s a novel inspired by Tolstoy’s childhood.
I’m not sure that’s true, Troyat writes that he was “the most good-natured and soft-hearted man alive”. And:
Thank you for that correction!
I really liked this post. Thanks for writing it. It seems like a hugely important topic. I see it as being part of a more general question of how to improve one’s productivity. It feels like there are things most people can learn from this post that plausibly would multiply their productivity. And I think that is relevant to questions of AI safety and EA.
Also relevant: Peak: Secrets from the New Science of Expertise by Anders Ericsson.
I’ve been reading about Richard Feynman recently. His dad was really awesome. He’d spend a lot of time with Richard as a kid having conversations with him, teaching him things, etc. An example that really stuck with me is the difference between knowing about a thing and knowing the name of a thing.
Interesting. I wouldn’t have predicted that in advance. Cal Newport talks about the importance of boredom too in Deep Work and Digital Minimalism.
I wonder if this was said in a tongue-in-cheek manner. I like to think that “genius” requires enough wisdom to think about alignment as well as capabilities and not do things that are “world destroying”, but I also realize that the way people commonly use the term “genius”, it isn’t really about alignment.
It sounds like this is saying “you can raise your kids this way too”. But I suspect that even as an adult, you can benefit from these sorts of things.
The book Cradles of Eminence does something similar—the authors read a massive amount of biographies of eminent people and wrote about the common threads in their childhoods. The book corroborates what has already been mentioned in this post on the importance of intellectually stimulating environments and tutoring. The book’s subjects also tended to grow up in natural settings, came into conflict with the education system and society (unclear whether this is a cause or an effect of their giftedness), experienced a disability or early frailty that set them apart from their peers and forced them to adapt, had households with a domineering parent (usually a mother) who pushed them to succeed, and experienced certain kinds of trauma or neglect.
The above is, I’m afraid, an example of survivorship bias. Famous people have biographies written about them. These books concentrate on aspects of their lives that are salient (though not necessarily relevant). There are probably thousands (if not millions) of people who had similar upbringings but who never got famous enough for someone to write a book about them. Other examples are:
Lots of great chefs shout at their staff, therefore, to be a great chef you have to shout at your staff
Ditto great film directors
One interesting factor in the modern world is the availability of the internet. It seems possible that many kids who in the past may have been unable to get access to a good milieu irl can, in modern times, find something like it on the internet. For obvious reasons, looking at historical figures underrates this.
I agree. It doesn’t really matter which medium you use to curate your milieu. Some used letters. Most did it in person. Today the internet will be a crucial tool, especially since it greatly scales the availability of good milieus.
Where I live, for example, there are few interesting people around. But I have been able to cultivate a strong network online, and I can give my children access to that—much like how Woolf’s father would invite his friends to dinner and talk with and in front of the kids.
Also, since a few people elsewhere in the comments have pointed out that some of the tricks they used seem stupid (talking Latin, for example), I must say that I find that to be an obviously good idea. Today, it would be English rather than Latin, but making sure that your kids are fluent in the lingua franca greatly increases the number of interesting people they can observe and interact with.
Or, more likely, these children were unusual to begin with. Foucault, for all the weakness in his thinking, understood this pretty well. Strange people can help society to change. But remember, for every odd fellow ten standard deviations up there is another odd fellow ten standard deviations down.
I absolutely agree that children should be exposed to interesting people and environments, be self-directed, be tutored, and have apprenticeships.
But given that thousands of people had these experiences contemporaneously with geniuses, and only dozens are geniuses, I think the genetics are the secret sauce.
Also, genetic geniuses with non-exceptional experiences may have been just as much a genius as the famous ones, but did not have a chance to become famous, so again the experiences help with the fame, not necessarily the base characteristic of genius.
The lesson then is if you have a child who is a genius, do these things. Unfortunately, very few of us can take advantage of that advice. Fortunately, we should all try to do these things regardless of whether our children are geniuses.
When most people had little or no education and some people had access to private tutors, is it a surprise that some of them ended up exceptional?
I mean, you are talking about the pre-internet era, an era when people were relatively less knowledgeable.
Private tutoring still gives an edge today, as you can personalize it to the needs of the pupil.
I cannot imagine someone having several private tutors since childhood and not being exceptional in some regard or above average at least in knowledge if not intellect.
This could still be a good text if it is strongly edited. Anecdotal evidence could be a nice start of a discussion. However, the supposed percentages are as I understand, just pure speculation on part of the author.
Moreover, defining more precisely what is meant by ‘exceptional’ would be necessary if it were a more quantitative study.
If I understand the bottom line—more individualised education, more freedom (also think of car-centric worlds in which children do not roam freely and are restricted in their freedom by lack of public transport, affordable youth hostels etc.), being more in touch with physical reality (nature vs social constructs vs virtual world).
https://en.wikipedia.org/wiki/Survivorship_bias
Much like the “Maslow hierarchy of needs study,” this paper (it’s not research since it literally is a product of the author’s imagination) is biased, narrow minded and distinctly from a colonized Western Point of View, which makes it at best amusing and at worst useless to the billions of non-Western, non-white, decolonized minded people of the world. Studying what makes white privileged people successful is probably the least useful study there is. It’s kind of like studying what made Steve Jobs and Bill Gates successful: pretty useless for 99 percent of the world since these two white men had a slew of white privilege to guide them to greatness (hard work is a euphemism for centuries of educational, economic and political privilege and advantage inherited because of your gender and skin tone…) Now studying what makes people who are the antithesis of white, male privilege, and from colonized cultures and who are successful (a relative term of course) would be a tad more relevant and interesting. Miss me with the idea that more “dead white male authors, inventors, politicians…” whom we call successful are to be emulated. Look what atrocities that philosophy has wrought around the world. Enough. Let’s move on to study others who haven’t destroyed economies, families, communities etc., we have much to learn from them.