you can very quickly check to see if you are a natural computer programmer by pulling up a page of Python source code and seeing whether it looks like it makes natural sense
The 2006 study that claimed that humans divide neatly into “natural computer programmers” and “everyone else” failed to replicate in 2008 on a larger population of students. And it didn’t rely on students’ subjective assessment of whether code “makes natural sense”, but their measured consistency in answering questions about it.
We know from other studies that people can have highly erroneous assessments of their own ability — both in the Dunning-Kruger sense (low-skilled people overestimate greatly; high-skilled people underestimate slightly) and in the impostor-syndrome sense (high-skilled people can sometimes dramatically underestimate their skill).
In other words, if you look at a page of Python code and don’t get a subjective feeling that it makes sense, that does not place you in a population of “not natural computer programmers”.
(Disclaimer: I’m of the opinion that coding should be treated as a literacy skill — like reading, writing, and arithmetic.)
I think that much of the time what’s actually going on is that they dramatically overestimate everyone else’s skill.
The study I think you are referring to was not about whether the two humps pattern existed—that was the initial empirical observation (contrasting with other subjects such as mathematics). Instead the study was focused on a test that the authors had hoped would distinguish the two groups in advance, which was later found not to work. That in itself does not mean that the two groups don’t exist.
edit: Although I agree that the fact that someone doesn’t understand Python right away doesn’t mean they won’t be able to learn, and I don’t think Eliezer meant to imply that—it’s just an easy enough test to offer the possibility of a quick win.
Performance being bimodal doesn’t mean that aptitude is bimodal, though. Different teaching styles may have different transfer functions, as it were.
Trivially, consider a teacher who only teaches at the level of the highest-aptitude students and allows no remedial or catch-up work for the lower-level ones, versus a teacher who only teaches at the level of the lowest-aptitude students and allows no enrichment for the gifted ones. We’d expect the former to produce bimodal levels of student performance given normally-distributed aptitudes (because almost everyone is left behind and performs very poorly, while the top students excel), while the latter would tend to produce performance distributions that were flatter than the aptitude distribution (because nobody is really allowed to excel).
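To make the mechanism concrete, here is a toy simulation (a sketch with made-up numbers, not data from any study): feed the same normally-distributed aptitudes through the two teaching “transfer functions” and look at the resulting performance histograms.

```python
import numpy as np

rng = np.random.default_rng(0)
aptitude = rng.normal(0.0, 1.0, 10_000)  # unimodal, normally distributed aptitude

# "Teach to the top": students above the threshold track their aptitude;
# everyone else is left behind and clusters near the floor -> two humps.
teach_to_top = np.where(aptitude > 1.0, aptitude, rng.normal(-3.0, 0.3, 10_000))

# "Teach to the bottom": gains above the teaching level are heavily
# compressed, so nobody is really allowed to excel.
teach_to_bottom = np.where(aptitude > 0.0, 0.2 * aptitude, aptitude)

def text_hist(x, bins=15, width=50):
    counts, edges = np.histogram(x, bins=bins)
    for count, lo in zip(counts, edges):
        print(f"{lo:6.2f} | {'#' * int(width * count / counts.max())}")

print("teach-to-the-top performance (bimodal):")
text_hist(teach_to_top)
print()
print("teach-to-the-bottom performance (excellence squashed):")
text_hist(teach_to_bottom)
```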
I think they claimed that bimodal performance was typical for CS, which if true* would require CS teaching to be systematically biased in a way that doesn’t happen in other subjects in order to produce the effect, which strikes me as possible but unlikely.
* I wish they had given more statistics on this, IIRC it was a bit anecdotal.
In other words, if you look at a page of Python code and don’t get a subjective feeling that it makes sense, that does not place you in a population of “not natural computer programmers”.
Yep, it might even be the opposite—if you can look at a page of Python code without any previous programming experience and tell yourself that you understand it, you are way too much of a rationalizer to ever be any good at programming :P
http://lesswrong.com/lw/2vb/vanity_and_ambition_in_mathematics/2scr
Hah. In discussing the methodology of the “camel has two humps” study with a friend who’s an okay programmer, the idea came up that what they might have been measuring was overconfidence. People who are ignorant but overconfident would exhibit a consistent (but possibly wrong) model, whereas people who are ignorant and know it might hedge their bets by not answering consistently. Some courses and instructors (but not all) certainly do favor the overconfident student.
Not sure why the parent is so highly upvoted. Coding aptitude clearly exists, just like aptitude for math, music and writing. You can teach most people the basics of any of those, but without aptitude they will never be any good at it.
Disclaimer: I’m of the opinion that coding should be treated as a literacy skill — like reading, writing, and arithmetic.
No idea what makes you think that. Coding is a highly specialized skill not useful to most people in everyday life.
The 2006 study I’m referring to is entitled “The camel has two humps”, and attempts to establish that coding aptitude not only exists, but is bimodally distributed (“two humps”) and can be predicted accurately before the student has taken any coursework or written any code. IOW, that you can discern, pretty unambiguously, who is worth teaching before you try teaching them.
And that is what didn’t replicate.
Sure, aptitude exists. But there probably isn’t a bright line, or even a bottleneck, between natural coders and everyone else.
Coding is a highly specialized skill not useful to most people in everyday life.
That’s what they said about literacy a thousand or so years ago. If you’re not a scribe or a priest, why bother? Today, though, a person who can’t read is effectively mentally incompetent to deal with ordinary, expected situations in society. Within a hundred years, the same will be true of a person who can’t choose and apply algorithms to solve problems. It’s not about getting a job slinging Java; it’s about being able to tell (increasingly-omnipresent) machines what you want them to do for you.
The study only disproved one particular test, not all possible tests. Most likely a successful test is indeed possible if the premise of strong bimodality is true.
No idea what makes you think that. Coding is a highly specialized skill not useful to most people in everyday life.
Coding in the sense of banging out a thousand lines of C++ to validate XML or talk to a backing database or perform OCR or something is a pretty specialized skill, but scripting languages and the basics of algorithm design are at least potentially useful for anyone that does a lot of repetitive work involving computers. Which is an awful lot of people, including some you might not expect—if you’re a mechanic, for example, you can squeeze out a lot of comparative advantage if you’re good at talking to a car’s onboard computers.
I expect it to become even more generally useful as computers get smarter and more pervasive.
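As a concrete illustration of the kind of scripting meant here, a minimal sketch (the folder layout and the “amount” column are hypothetical): totaling a column across a folder of CSV reports, the sort of chore otherwise done by hand.

```python
import csv
from pathlib import Path

# Sum the (hypothetical) "amount" column across every CSV in a folder --
# tedious by hand, trivial in a script.
total = 0.0
for report in Path("reports").glob("*.csv"):
    with report.open(newline="") as f:
        for row in csv.DictReader(f):
            total += float(row["amount"])  # assumes an "amount" column exists

print(f"grand total across all reports: {total:.2f}")
```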
I expect it to become even more generally useful as computers get smarter and more pervasive.
Historically that hasn’t been the case.
When personal computers became popular (say, in the 1980s), the prevalent thought was that everyone would need to know programming to make use of them, so there was a wave of putting BASIC courses into schools, etc. This turned out to be quite wrong. As time went on, you needed less and less specialized knowledge of any kind to interact with computers.
I don’t see why this trend would suddenly break and reverse.
I’d distinguish between useful and necessary here. A user with no programming knowledge can clearly do a lot more now than they’d have been able to in 1993 let alone the 1980s, enabled largely by UI improvements: first the GUI revolution of the Eighties and early Nineties, then more incremental improvements as GUI idioms were refined over time, then innovations building on these trends. I expect this to continue.
If we stop there, however, we ignore the other side of the equation. A user with basic programming knowledge and the right mindset can now do everything a naive user can and much more, thanks among other things to an explosion in easily available libraries and the increasing popularity of capable high-level languages with intuitive semantics. Moreover, there are wide domains that UI changes haven’t touched and basically can’t, such as all but the simplest forms of automation: it’s a rare UI that exposes so much as a conditional outside of one-time user interaction, and the exceptions (like the Word and Excel features Gwern mentioned in a sibling post) often implement scripting languages in all but name. I expect these trends to continue, too.
Taken together, that gives us a gap in capability that’s likely to increase in absolute terms, even if the proportions narrow or remain stable. I’m betting on “stable”, myself.
Certainly a more capable user can do more than a less capable user, but that’s just restating the obvious.
I would argue that there is a more important trend here: the growth and accumulation of software—accumulation which continues to reduce the need to program anything from scratch. Thirty years ago, if you wanted, say, a tool to convert metric amounts to imperial and back, you had to write it yourself. Nowadays your task is to select one of a few dozen apps that can do it. Most needs of an average user (the ones he himself recognizes as needs) have already been met, software-wise—no need to code anything yourself.
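For scale, here is roughly what that write-it-yourself converter amounts to in a modern scripting language (conversion factors rounded; a sketch, not a polished tool):

```python
# Rounded conversion factors; a sketch, not a validated tool.
FACTORS = {
    ("km", "mi"): 0.621371,
    ("kg", "lb"): 2.20462,
    ("l", "gal"): 0.264172,
}

def convert(value, src, dst):
    if (src, dst) in FACTORS:
        return value * FACTORS[(src, dst)]
    if (dst, src) in FACTORS:
        return value / FACTORS[(dst, src)]
    raise ValueError(f"no conversion from {src} to {dst}")

print(convert(10, "km", "mi"))  # -> 6.21371
print(convert(10, "mi", "km"))  # -> ~16.09
```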
I believe you also overestimate the willingness of the general population to get involved with programming of any kind. I happen to know a group of fairly high-level accountants. They are all smart people, computer-literate, and work all day in Excel with rather complex structures. A significant part of their work is automatable and would be made noticeably easier with a collection of scripts and snippets in Excel’s VBA (Visual Basic for Applications). However, they refuse, explicitly and forcefully, to delve into VBA, using a variety of not-too-rational arguments which boil down to “we’re accountants, not programmers”. I don’t believe these people are exceptions; I think they are the rule.
Oh, and MS Office certainly implements a full-blown scripting language—the above-mentioned VBA.
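For the curious, a sketch of the sort of snippet being refused, shown here in Python with the third-party openpyxl library rather than VBA, and with a hypothetical workbook and column layout:

```python
from openpyxl import load_workbook  # third-party library: pip install openpyxl

# Hypothetical workbook: a "Journal" sheet with debit in column C,
# credit in column D; rows that don't balance get flagged in column E.
wb = load_workbook("monthly_close.xlsx")
ws = wb["Journal"]

for row in ws.iter_rows(min_row=2):  # skip the header row
    debit = row[2].value or 0   # column C
    credit = row[3].value or 0  # column D
    row[4].value = "OK" if abs(debit - credit) < 0.005 else "CHECK"

wb.save("monthly_close_flagged.xlsx")
```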
I believe you also overestimate the willingness of the general population to get involved with programming of any kind.
I don’t believe I made a statement about that. I’m not trying to predict whether general computer science skills will become more common outside of formal software engineering in the future—that’s not something I’m equipped to answer. I’m saying that the potential value added by being an accountant who can code or an HR specialist who can code has increased over the last decade or so and will probably continue to do so.
I don’t know if I’m willing to agree with that. The main reason is growing complexity. We’re basically talking about amateur coders: programming isn’t their main skill, but they can do it. As the complexity of the environment increases, I’m not convinced their necessarily limited skills can keep up.
There are at least two directions to this argument. One is that not many non-professional programmers are good programmers. The typical way this plays out is as follows: some guy learns a bit of Excel VBA and starts by writing a few simple macros. Soon he progresses to full-blown functions and pages of code. Within a few months he has automated a large chunk of his work and is happy. Until, that is, it turns out that there are some errors in his output. He tries to fix it and he can’t—two new errors pop up every time he claims to have killed one. Professionals are called in and they blanch in horror at the single 30-page function which works by arcane manipulations of a large number of global variables, all with three- or four-letter incomprehensible names, not to mention hardcoded cell references and specific numbers. The code is unsalvageable—it has to be trashed completely and all output produced by it re-done.
The second direction is concerned not with the complexity of the task, but the complexity of the environment. Consider, for example, the basic example of opening a file, copying a chunk of text from it, and pasting it into another file. That used to be easy (and is still easy if the files are local ASCII text files and you’re in Unix :-D). But now imagine that to open a file you need to interface with the company’s document storage system. You have to deal with security, privileges, and permissions. You have to deal with the versioning system. Maybe the file itself is not really a file in the filesystem but an entry in a document database. The chunk of text that you’re copying might turn out to be in Unicode and contain embedded objects. Etc., etc. And the APIs of all the layers that you’re dealing with are, of course, written for professional programmers who are supposed to know this stuff well...
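For contrast, the “easy case” really is a handful of lines (filenames hypothetical):

```python
# Local plain-text files, no document store, no permissions layer in the way:
with open("source.txt") as src:
    chunk = src.read()[100:500]  # grab some chunk of the text

with open("dest.txt", "a") as dst:
    dst.write(chunk)
```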
I think you’re judging the hypothetical amateur programmer too harshly. So what if the code is ugly? Did the guy actually save time? Does his script make more errors than he would make if doing everything by hand? Is the 30-page function really necessary to achieve noticeable gains or could he still get a lot from sticking to short code snippets and thus avoiding the ugliness?
Similarly with the second example. Maybe some steps of the workflow will still have to be done manually. This wouldn’t fly in a professionally programmed system. But if someone is already being paid to do everything manually then as long as they can automate some steps, it’s still a win.
It’s been said that Excel is the world’s most popular programming language. A ton of people write Word macros and Excel spreadsheets without officially being ‘programmers’, though they pretty much are, and they may well benefit from learning formal programming, either to better understand the tools they’re using and their limits or to switch to a more powerful tool; when we take all those people into account, coding looks like much less of a highly specialized skill.
There are a lot of people who want to believe that anyone can do anything, that we’re all equals in every way. One can sometimes run into really nasty attitudes when talking about intellectual differences: clear examples of fluff like “we’re all gifted” and myths like “giftedness goes away when children grow up”. Granted, it would be kind of weird to see that on LessWrong because these guys seem pretty in touch with reality when it comes to acknowledging that intellectual differences exist. Perhaps it is, instead, the mind projection fallacy. Most of these guys can program, so maybe they figure most other people can learn to program the same way they did. I’ve noticed that a lot of gifted people have this problem: they have an ability, think of it as normal, and assume average people will be able to do it.
Thinking algorithmically should be a basic course taught in school. Many people muddle through a life filled with magic and not causation.
I am not convinced that “thinking algorithmically” (whatever it means and however it is related to coding) is correlated with success or happiness or any other useful metric. I am also not sure that teaching one to write simple programs is going to make them better at thinking about their life in a systematic way. It certainly does not seem to do so for professional programmers, in my experience.
edit: I see that you were responding to the claim that coding specifically should be what is taught. I retract my objection to your objection.

http://www.psychologytoday.com/blog/in-one-lifespan/201210/critical-thinking-and-real-world-outcomes
I’m not talking about a class teaching an introduction to programming. I mean a class about analyzing everyday problems in a step-wise fashion: teaching children how to break problems into granular steps instead of relying on intuitively discovering the answer or failing. In my experience schools are failing horribly at this, and the children whom I coached in such basics of problem solving saw across-the-board academic improvements.
I am confused. What exactly does the ability curve for mathematics look like? I think the skills are similar. And if not, then I would like to see a graph displaying the relation between math skills and computer skills.
Specifically, I can imagine a person good at math who for some reason never tried programming, but I have a hard time imagining a person good at math who has trouble understanding how to construct algorithms when they are properly explained. Assuming that math skills follow a Gaussian curve (do they?), what would make the programming-skills curve bimodal?
Good question. The bimodal distribution (vs. normal for math) was the claimed observation, but I don’t know if there’s solid statistical evidence for it existing beyond the specific classes they gave examples for, or if it’s just anecdote. At any rate, if it’s real, no one knows why it exists. Anecdotally, when I took my first programming class there was a guy who was much better at math than me but seemed to have a harder time grasping the concept of programming than I did.
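One way the bimodality claim could be tested, sketched here on simulated grades since real class data is exactly what’s missing (requires the scikit-learn library): fit one- and two-component Gaussian mixtures and compare BIC scores.

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # third-party: scikit-learn

rng = np.random.default_rng(1)
grades = np.concatenate([
    rng.normal(35, 10, 120),  # hypothetical lower hump
    rng.normal(75, 8, 80),    # hypothetical upper hump
]).reshape(-1, 1)

for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(grades)
    print(f"{k} component(s): BIC = {gm.bic(grades):.1f}")  # lower BIC wins
```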
The 2006 study that claimed that humans divide neatly into “natural computer programmers” and “everyone else” failed to replicate in 2008 on a larger population of students.
This is an incomplete and inaccurate summary of the research. Further work has been done, and a revised test shows significant success:
Meta-analysis of the effect of consistency on success in early learning of programming (pdf)
Abstract: A test was designed that apparently examined a student’s knowledge of assignment and sequence before a first course in programming but in fact was designed to capture their reasoning strategies. An experiment found two distinct populations of students: one could build and consistently apply a mental model of program execution; the other appeared either unable to build a model or to apply one consistently. The first group performed very much better in their end-of-course examination than the second in terms of success or failure. The test does not very accurately predict levels of performance, but by combining the results of six replications of the experiment, five in the UK and one in Australia, we show that consistency does have a strong effect on success in early learning to program but background programming experience, on the other hand, has little or no effect.
The previous research and the test itself can be found on this page.
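For flavor, an illustrative item in the spirit of the test described above. This is not from the paper: the actual instrument uses Java-style assignment questions and is linked from the page mentioned.

```python
# Illustrative only -- the real test items use Java-style syntax.
a = 10
b = 20
a = b
# What are a and b now? Candidate mental models:
#   copy right-to-left: a == 20, b == 20   (how assignment actually works)
#   swap:               a == 20, b == 10
#   add:                a == 30, b == 20
# The test scores consistency: whether a student applies the SAME model
# across many such items, not whether the model happens to be correct.
print(a, b)  # -> 20 20
```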