What Direct Instruction is
A couple of days ago, prompted by several recent posts by Owen_Richardson, I checked out the book “Theory of Instruction” (Engelmann and Carnine, 1982) from my university library and promised to read it this weekend and write a post about Direct Instruction. This is that post.
Learning through examples
Direct Instruction is based on a theory of learning that assumes the learner is capable of extracting a concept inductively through examples of that concept. I may not know what a blegg is, but after you show me several examples of bleggs and rubes, I will be able to figure it out. The principle of DI is to use the same basic procedure of giving examples to teach every concept imaginable. Naturally, in some cases, the process might be sped up by giving an explanation first; furthermore, there are some things in every subject you just have to memorize, and DI doesn’t magically change that. However, it is assumed that the examples are where the real learning occurs.
The meat of the theory is using experimental data and cognitive science to establish rules for how examples ought to be given. Here are a few of the more basic ones:
It is impossible to demonstrate a concept using positive examples alone. Here I am reminded of the 2-4-6 game, in which subjects fail to test triplets that disconfirm their hypothesis. A teacher has control over the examples presented, so it is important to disconfirm the hypotheses that the learners (consciously or unconsciously) generate.
To successfully teach a quality, it is important that all positive examples only share that one quality. Imagine that you are being taught what a blegg is by a sequence of examples that include blue eggs and red cubes. By the end, you will not be certain whether the defining feature of a blegg is that it’s blue, or that it’s an egg, or both at once, or if the critical factor is the vanadium ore content of an object.
The way the example is presented is also a quality that must be controlled in this fashion. This is because inductive learning is not entirely a deliberate process on the part of the learner. For instance, if positive and negative examples alternate, the learner may extract the rule that “every other object is a blegg”. There are multiple ways this can become a real problem: I’ve encountered calculus students who were confused by a problem that asked them to integrate with respect to a variable called “t”, rather than “x”.
The examples must be followed by tests, which fall in the range of given examples but are not identical. This is the way to diagnose the learning process, and is the reason that you get ideas such as “DI is about asking the students 10 questions a minute.” This is not a defining feature of DI, but you can see now that it can easily happen when the concept being taught is a simple one.
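The first two rules above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two-feature world, the candidate rules, and the choice of "blue and an egg" as the true concept are not taken from the book. It only shows that a positive example alone leaves every candidate rule standing, while well-chosen negative examples eliminate the wrong ones.

```python
# Candidate rules a learner might induce about what a "blegg" is.
# Both the rule set and the feature space (color, shape) are hypothetical.
HYPOTHESES = {
    "is blue": lambda color, shape: color == "blue",
    "is an egg": lambda color, shape: shape == "egg",
    "is blue and an egg": lambda color, shape: color == "blue" and shape == "egg",
    "is blue or an egg": lambda color, shape: color == "blue" or shape == "egg",
    "anything": lambda color, shape: True,
}


def consistent(examples):
    """Return the names of all hypotheses that agree with every labeled example."""
    return [
        name
        for name, rule in HYPOTHESES.items()
        if all(rule(color, shape) == label for (color, shape), label in examples)
    ]


# One positive example: a blue egg is a blegg.
positives_only = [(("blue", "egg"), True)]

# The same positive example plus two negatives that each differ in one quality.
with_negatives = positives_only + [
    (("red", "egg"), False),
    (("blue", "cube"), False),
]

print(consistent(positives_only))   # all five rules survive
print(consistent(with_negatives))   # only "is blue and an egg" survives
```

Note that each negative example differs from the positive one in exactly one quality, which is what lets it rule out a specific wrong hypothesis.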
I don’t mean to imply that DI is restricted to dealing with yes-or-no identification questions. The examples and concepts can get more complicated, and there is a classification of concepts as comparative, multi-dimensional, joining, etc. This determines how the examples should be presented, but I won’t get into the classification here. In practice, a lot of concepts are taught through several sequences of examples. For instance, teaching integration by substitution might first involve a simple sequence of examples about identifying when the method is appropriate, then a sequence about choosing the correct substitution, before actually teaching students to solve an integration problem using the method.
Faultless communication
“Faultless communication” isn’t a misnomer exactly, but I think it lends itself to some easy misconceptions. The basic idea is that a sequence of examples is a faultless communication when there is only one possible rule describing all the examples; there is then the often-repeated statement that if a faultless communication fails, the problem is with the learner, not with the method.
When the book gets into details, however, the actual theory is much less dismissive. In fact, it is emphasized that in general, when a method fails, there’s something wrong with the method. A well-designed sequence of examples is not (usually) a faultless communication. Rather, it is a sequence of examples calibrated in such a way that, if the learner arrives at an incorrect rule, the test examples will identify the incorrect rule, which can then be traced back to an ambiguity in the examples given. Alternatively, it can make it clear that the learner lacks sufficient background to identify the correct rule.
The actual issue that the concept of faultless communication is meant to address is the following. When you don’t have a clear way to diagnose failure while teaching a concept, it leads to blind experimentation: you ask “Did everyone understand that?” and, upon a negative answer, say “Okay, let me try explaining it in some different way...” You might never stumble upon the reason that you are misunderstood, except by chance.
My own thoughts
A disclaimer: I have very little experience with teaching in general, and this is my first encounter with a complete theory of teaching. Parts of Direct Instruction feel overly restrictive to me; it seems that it doesn’t have much of a place for things like lecturing, for instance. Then again, a theory must be somewhat restrictive to be effective; unless the intuitive way I would teach something is already magically the optimal way, the theory is no good unless it prevents me from doing something I would otherwise do.
An interesting aspect of Direct Instruction that I don’t think has been pointed out yet (well, the book, written in 1982, might not be a likely place to find such a thought): this method of teaching seems ideally suited for teaching an Artificial Intelligence. Part of the gimmick of Direct Instruction is that it tries, as much as possible, not to make assumptions about what sort of things will be obvious to the learner. Granted, a lot of the internal structure still relies on experimental data gathered from human learners, but if we’re creating an AI, it’s a lot easier to program in a set of fundamental responses describing the way it should learn inductively, than to program in the concept of “red” or “faster than” by hand.
I still have the book and plan to hold on to it for a week or so; if there are any questions about what Direct Instruction is or is not, ask them in the comments and I will do my best to figure out what the theory says one way or the other.
Misha, you are spectacularly awesome. =D
I mean, it’s aggravating to see things you wrote and go, “But I SAID that! Was everyone just skimming over that part or what?”, but as the aphorism runs in the DI world, “If the learner hasn’t learned, the teacher hasn’t taught”, eh? :P
[And until one sees that aphorism as perfectly consistent with “logically faultless communication”, one must know that one still hasn’t understood the meaning of the technical term.]
I knew I’d make terribly stupid mistakes in miscommunicating this stuff when I started, so I figured it was time to let go of my fear of not having it be perfect in the first place and just start trying.
I should also make sure, when you say it was 1982, do you mean original publication, or that of the copy you got? The second (and most recent) edition is 1991.
Dunno offhand what’s different. Never saw the older one myself.
You should have given some examples of things that are direct instruction and some that are not, and let us figure out what it was for ourselves! :p
Ha ha. See here.
Yeah, no. I can see not providing examples of everything you talked about, and generally not following your own preferred method to the letter. But the picture Misha has given me of DI would have told you to provide clear positive and negative examples of something within about the first full screen of text. I think I looked at three screens’ worth before giving up.
Indeed.
The reason I did not, rightly or wrongly, was because you have to start off doing this by showing how it applies in the most basic context, like in the AthabascaU module.
This results in a very technical analysis of something that initially seems trivial and pointlessly detailed, and unrelated to the amazing-looking results from studies like Project Follow-Through (which, remember, the meta-analysis says are representative).
I remember glazing over that section in the AthabascaU module myself the first time I read it. And several times after that. Only my emotional experience with the Michel Thomas lessons made me keep focusing on it until it clicked. Way later.
Now, many people on LW surely have much quicker intelligences for such things than I do.
But see the last paragraph of this comment from prase, in which he, at least, is having the same ‘this seems trivial and pointlessly detailed’ reaction after reading the AthabascaU module.
Why was he sticking with it? I believe because he had heard my emotional enthusiasm, and wanted to find out if I was just a crank, or if there was actually a rational reason for all that gushing “this thing is important!”
I believe that in the future, when detailed knowledge of DI has become a common thing on LW among people who never read my original post, some of those people will go back and read it, and go, “Huh? Makes perfect sense to me!” making it an excellent case study of how someone can have read Eliezer’s “Expecting Short Inferential Distances”, marked it in their mind as very true and very useful, studied DI theory, and still had to go and run smack into the brick wall, knowing explicitly that that was what they were doing, before truly emotionally understanding that, yes, it actually does apply to them.
Anecdotally, this post interested me in direct instruction; none of yours did. Going back and looking, I finally found (16 paragraphs into the “quick sketch of the basic theory” section, and 7 pages of text into the post) a sentence that hinted at the intriguing description in this post: “This is why I say that a huge part of the basics of DI is ‘guided-induction’ (my term, not used in the field).”
Remember that, inductively, every sentence I read without knowing what I’m reading about or becoming interested lowers my belief that I will eventually find out what I’m reading about and become interested. The “show, don’t tell” maxim in writing helps to defend against results like 7 pages of sharing your enthusiasm before giving any clue as to what distinguishes the subject of your enthusiasm from the closest 100 enthusiasm-gathering subjects.
At this point, I have nothing more detailed to respond to that than, “I am now extremely aware of that, but thank you for telling me again, because the extra repetition couldn’t hurt my chances of remembering to thoroughly apply it in the future.”
Sorry to f5 it, then—I just got the impression you were thinking inferential distance was the main problem.
Oh no, I know DAMN well I could’ve done WAY better if I’d been less stupid in the first place! Although if I had to communicate with my past self, I think the best thing I could have told him would be just to put a note at the beginning of the original post saying explicitly that it was a draft with many, many problems, but that I was pretty damn sure DI was a super-important topic to bring to the attention of LW, so if anyone would be so super-cool nice as to give me some feedback on how to make it more presentable...
There’s no way I could communicate the things I’ve learned so far to him more effectively than his resulting experience would teach him.
Uh, does this seem like an interesting idea?
You’re right, writing concisely is definitely a learned skill.
I became pretty good at it, but that’s only through practice and helpful editors at my college student newspaper and a couple of newspaper internships. If you want to improve your professional writing skills, find a place where you can practice and people will point out your flaws so you can improve. LessWrong can definitely serve that function.
Glad you have a thick skin, glad you could start a useful conversation, and hope to see more of you in the future!
I’ve often lamented the fact that colleges so frequently assign papers with excessive minimum page limits when they would better serve their students by applying restrictive maximum page limits. Instead of learning to appreciate conciseness as a virtue and a skill, students come away with the association that a piece of writing must be long to be respectable, a lesson which many, it seems, go on to apply in their careers.
Thank you, although it’s not so much the writing per se as the analysis of the precise structure of the inferential gaps that needed to be bridged.
And you’ll see lots more of me in the future. I honestly think a big part of the reason I got over my fear of it not being perfect and posted it already was because I’m very lonely, and the case study of the NYC rationalist chapter was the biggest carrot ever.
It’s not that I’m socially inept. Quite the opposite, when I apply myself. It’s just that I get so damn tired of… well, you know, the second paragraph sums it up perfectly already, doesn’t it?
As to having a thick skin, I was actually pretty depressed the first day I got up and saw the first batch of comments, which seemed very negative, like Alicorn’s.
“Pretty depressed” as in not able to keep myself from wondering whether my failure to just commit a nice painless suicide already due to my self-preservation instinct was essentially a form of akrasia. (Obviously, past issues exist, and I’ve been using my informal understanding of REBT to keep myself together, although I think I am “naturally” a very optimistic person.)
But I forced myself to confront the question and admit, as I always do, that I do care, and am going to keep on fighting no matter how impossible success seems or how much it seems that I always just end up getting hurt over and over again, so I may as well stop whining to myself and get back to work! So I cheered myself up.
And then I got home and saw that the situation was actually pretty damn good (had like 20 upvotes, and a couple very positive messages from a few individuals), so...
I don’t think I’m going to have a crisis of faith in “the light in the world” ever again.
If it makes you feel any better, I did read that part with considerable interest, and I understood how it related to your example of teaching the numbers 1-100, but I felt like it was touched on only briefly and the rest of the article was really long and pretty scattered, so I was left unsure whether the set of rules for choosing examples was DI, or one of the main things about DI, or just an example of why DI was awesome, or what.
I do think I might be able to make use of this. When I’m teaching a (usually high school-age) kid how to do math problems, I tend to use a series of examples like this:
Here’s a simple example of how to do this technique. Each step is mathematically valid using these rules you already know, and the point of doing it this way is that it gets you to your answer like this. Want to see another one? Ok, then let me switch it up...
Here are some trickier problems. If the problem looks weird in these particular ways, you can still use the technique by doing this. Otherwise, it’s basically like the first example.
If you’re going to screw it up, it’ll probably be like this or this. Please notice that this is not the same as the right way to do the problem. Also, a lot of people make this careless error. Make a checklist of these mistakes to look for in your work.
I guess DI would tell me to use positive examples that are as diverse as possible, and to avoid confusing examples where you can get a right answer by doing something other than the right process? Would you suggest anything else?
The first obvious thing that comes to mind is to learn to use task analysis. If you’re going to be working in an environment where the instruction hasn’t been designed to contradict misrules before the students develop them, you’re going to need to do lots of correction.
Remember that unless you actually get one of the DI programs like “Connecting Math Concepts”, anything you do will be just little chunks of big-DI fading off into little-DI at the edges. Doesn’t mean it won’t help you do better than average, but it’ll be way below what’s really possible.
Is that useful? Anything in there unclear? Like how to learn task analysis?
[Edit: sorry, you said “usually a high school age kid”. There’s no high school level “Connecting Math Concepts”]
Well, it would have been considerate if you’d told me what is meant by “task analysis” here, with an eye to enlightening me as to why I will want to use it. I can only infer from context that it will somehow make doing correction better or easier.
Oh no, I didn’t mean “Is that all you need?” as in “subtext: I’ve given you enough, go away”. :P
I meant: “I know I need to give you more information. Tell me where I should start.”
I linked to a scanned page of Theory of Instruction here in this comment thread
Please start by providing a definition—like, the kind you might find in a glossary—of “task analysis” as you are using the phrase in the above comment.
Ah! Sorry, I was thinking maybe you had understood some of the contents of that thread already before I mentioned it in this one.
Anyway, sorry this reply took so long. I was having scanner issues.
Here’s the first page of Chapter 12 in ToI, “Programs Derived from Tasks” [edit: fixed from accidental link to section of the AthabascaU module]. A definition of “Task Analysis” is, of course, under that heading.
There are details in the definition that rely on knowledge of concepts covered earlier in the book, but as a whole, does it help?
I just realized that page starts the heading “Strict Task Analysis” but I didn’t scan “Transformed Task Analysis” since that’s on the next page, and that’s what you need.
But honestly, it is reasonable of me to direct you to the book yourself, right? Rather than trying to write a “Complete Guide to Task Analysis for Beginners!” right now?
Well, you didn’t define it in that thread either, as far as I can see, so I am confused by this statement.
In case this needs to be said: you really shouldn’t use jargon without defining it if you aim to write for beginners.
It is reasonable to quit whenever you decide it’s in your best interests to quit, of course. I’m sorry if you found my request for a definition onerous. I hope nothing I said seemed like a demand for a complete guide to anything; I didn’t intend it that way.
I may or may not ever get around to checking out the book from the UCF library. I was looking for more concrete and actionable pieces of advice on how to improve my teaching process, partly because they might be immediately useful, and partly because I am still undecided about whether DI has much to offer me and the quality/novelty of the advice would be significant evidence.
Anyway, thanks for your time.
ETA: The definition on the scanned page is sufficient, if not entirely transparent, so I upvoted you for answering my question. Thanks!
No no no! Please don’t mistake my tone! I am so happy that you’re asking me for detailed help with this! Responding to you is not onerous, but joyful!
Writing a “Complete Guide to Task Analysis for Beginners!” is something I’d love to do! I just know it won’t get done very soon.
I’m sorry I keep forgetting to examine my jargon that seems intuitively transparent to me and try to over-estimate how much explanation it needs. From now on I will start compiling a glossary of terms.
But yes, you raise a very important question:
“How much practical use can I get out of DI theory without actually studying it in depth?”
It is true that it is not like a magic item you can just put in your inventory and thereby receive extra points to your teaching ability, but an entire complex, well, theory, for engineering complex educational machines, which you have to understand and master the use of to create such machines yourself.
But still, there must be at least a few quick equivalents to things like pulleys and levers that I could distill for you.
The hardest part of that will be simply noticing what’s not already obvious to you...
How about if I submit the question to the DI community for you?
That is a doubleplus good idea.
Sure, sounds great.
Some other thoughts: perhaps you could give me some examples of specific teaching goals you have, and specific problems you often encounter?
Honestly, I suspect most of the problems high-school students have are due to lack of mastery of the basics. That they are weak enough on such things as adding/subtracting/multiplying/dividing fractions and working with exponents that they are likely to make mistakes on those even if they aren’t having their cognitive resources split between trying to track that shaky foundation and learn the details of the new thing you’re presenting to them.
If we could develop some systematic diagnoses, corrections, and practice materials (practice to mastery!) for just fractions and exponents, I think we might be able to hugely improve any tutoring you [or anyone else!] attempt.
ETA: If the lowest hanging fruit in improving your own skills is to “stop doing stupid shit”, then it follows that the lowest hanging fruit in improving your teaching is to figure out how to get your students to “stop doing stupid shit”. :P
Sure, I can see that would be helpful. Right now I have a bunch of SAT prep students, and I teach college kids calculus when there’s a demand, but for the sake of argument let’s consider Algebra II. One of the goals in Algebra II is to get the student comfortable with polynomials: factoring, multiplying and dividing them, and understanding the relationship between those processes and things like zeroes and asymptotes of functions. So maybe we should talk about factoring?
Nearly all my students get the hang of factoring polynomials once I can convince them to sit down and practice for a while (which presents its own set of difficulties), but I’m sure I’m not teaching it optimally. Problems I run into: confusion about which term in a quadratic comes from what (“It’s supposed to multiply to this and add to this, right? Or is it the other way around?”); neglecting to look for common factors first; confusion/frustration when the leading coefficient isn’t 1; not recognizing special cases like difference of squares (only sort of a special case), higher degree polynomials in quadratic form, or sum/difference of cubes; not knowing when to use factoring by grouping. I have my own ad hoc ways of dealing with these problems, but I have no reason to believe they’re the best possible.
Maybe this is still too broad, or I’m assuming too much familiarity with the subject matter? I’m just tossing it out.
I like this idea. I do pretty much re-teach how to use fractions (and to a lesser extent exponents) whenever they come up, but much as I would like not having to do that, I’m not sure the problem is easily solved. Kids don’t learn how to use fractions partly because they don’t believe they need to; they decide in elementary or middle school that “decimals are way better and you can use a calculator,” and once they’re in high school they find out about “Ans=>Frac” on their graphing calculators. In my experience they really, really resent being drilled on fractions, and forget what they’ve learned very quickly because they refuse to use it for anything else. Maybe I’m being too cynical, though.
I do think it would be very useful to put together a “quit making stupid mistakes” program—if I could get kids to stop making sign errors, or doing the wrong operation because they didn’t think about it for half a second, their test scores would probably soar—and indeed I’ve seen this happen with at least one student before, so I should probably try to implement it systematically.
Hmmm, there could be lots to reply to in that post, but I’ll try to keep it brief...
Can you give me a few specific examples of actual tasks that your students have problems with most commonly? Like, show me exactly what the students are presented with.
With that, I might be able to do a transformed task analysis, and develop an example cognitive routine.
Actually, factoring is used as one illustration of a cognitive routine in Theory of Instruction. I’ll scan that section when I get time.
Ok. Well, we were talking about factoring. Here’s a factoring problem I would not expect most of my students to get:
Edit: Sorry, I guess you wanted more than one example? Not sure whether these are supposed to all be examples of the same basic type of problem, or different, or what, but I added a couple more factoring problems.
Factor completely.
4x^2+11x-3
3x^3-13x^2-10x
3x^5-3x
2z^3+16
blinks
The last one might throw me off if I didn’t remember off hand that a^3+b^3 has a factoring—is there a reason one should be especially likely to mistake the others?
These are not intended to be hard factoring problems for LessWrong, as if there were such a thing. They’re hard factoring problems for most high school algebra students because none of them have a leading coefficient of 1, and because on all but the first you have to remember to look for common factors before they even look like something you can cope with. The first is mainly hard because most kids, when they even remember how to deal with the 4 at all, will try to factor it as (2x+c)(2x+c) and then give up.
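For readers who want to see the leading-coefficient case spelled out, here is a minimal sketch of one common classroom approach, the "AC method" with factoring by grouping. It is not claimed to be the routine from Theory of Instruction; the function name `factor_quadratic` and its return format are made up for this illustration, and it assumes integer coefficients with nonzero `a` and `c`.

```python
from math import gcd


def factor_quadratic(a, b, c):
    """Factor a*x^2 + b*x + c over the integers using the AC method.

    Returns (g, (p, q), (r, s)), meaning g * (p*x + q) * (r*x + s),
    or None if no integer factorization exists.
    Assumes a != 0 and c != 0 (pull out any factor of x beforehand).
    """
    if a == 0 or c == 0:
        raise ValueError("expects a != 0 and c != 0")

    # Step 1: look for a common factor first.
    g = gcd(gcd(a, b), c)
    a, b, c = a // g, b // g, c // g

    # Step 2: find m and n that multiply to a*c and add to b.
    target = a * c
    for m in range(-abs(target), abs(target) + 1):
        if m == 0 or target % m != 0:
            continue
        n = target // m
        if m + n == b:
            # Step 3: split the middle term and group:
            # (a*x^2 + m*x) + (n*x + c) = d*x*(a'x + m') + e*(a'x + m')
            d = gcd(a, m)
            a_reduced, m_reduced = a // d, m // d
            e = n // a_reduced
            return g, (d, e), (a_reduced, m_reduced)
    return None


print(factor_quadratic(4, 11, -3))    # (1, (1, 3), (4, -1)), i.e. (x + 3)(4x - 1)
print(factor_quadratic(3, -13, -10))  # (1, (3, 2), (1, -5)), i.e. (3x + 2)(x - 5)
```

The second call corresponds to the second problem above after the common factor of x has been pulled out, giving x(3x + 2)(x - 5). The "multiply to this and add to this" confusion mentioned earlier lives in Step 2: the pair must multiply to a*c (not just c) and add to b.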
Successfully factor inferential distance into how hard it will be to convince people in your life of the following:
Probability is subjective
Science and its method are a social model for humans to avoid believing in untrue things, not a logically correct way to discover what is most likely true.
Absence of evidence is evidence of absence (only applies for those who were or will be in college)
Good luck!
Its roots will be multiples of the three complex cube roots of unity, so it can’t be factored in the reals.
Maybe the “completely” is inappropriate here? In my defense, that’s what textbooks at this level usually say when they mean “factor as far as possible using real, integer coefficients”.
All I was looking for was 2(z+2)(z^2-2z+4).
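As a worked check of that answer, using the standard sum-of-cubes identity (nothing here is specific to any textbook):

```latex
\begin{align*}
a^3 + b^3 &= (a + b)(a^2 - ab + b^2) \\
2z^3 + 16 &= 2\,(z^3 + 2^3) \\
          &= 2\,(z + 2)(z^2 - 2z + 4)
\end{align*}
```

The linear factor gives the real root z = -2. The remaining quadratic z^2 - 2z + 4 has discriminant (-2)^2 - 4(1)(4) = -12 < 0, so it has no real roots and cannot be factored further with real coefficients.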
Problem is that the DI world, in terms of the actual experts on the theory rather than just people who deliver programs, is very small, and most of those experts work together in person rather than communicating online.
So it might take a while to get a response.
Heh, I actually just realized that I’ve been using some non-transparent LessWrong jargon in some of my communications with the DI community, like “inferential distance”.
The problem is that, once you understand the concepts common both on LW and in DI theory, there is so much overlap in meaning that it takes a little bit of conscious thought to remember which way of expressing an idea is appropriate in which context.
[I mean getting the context of LW and DI people confused, of course. In the context of individual sentences, it’s obvious which is most apt, hence why I need to stop myself from switching back and forth without thinking about it.]
My edition is 1982 (the library didn’t have any others). It doesn’t seem too different; in particular, page 143 in my edition is identical to the page 143 that you scanned and posted (which means, among other things, that nothing was added to or removed from the first 143 pages; at most, there are changes in wording). Perhaps it’s just a reprint.
That makes lots of sense. I know that the original publisher folded and that’s why they had to switch to publishing it through the Association for Direct Instruction directly, but I didn’t know whether they updated it at all beyond the preface at the same time.
The book is honestly full of little typos, so I doubted they’d edited it again anyway. I’ve been taking notes of things I think belong on an errata sheet myself.
Thank you for following through with this! It’s super awesome of you to take this on, then actually follow up and do what you wanted, in the timeframe you planned. This post was very good at concisely saying what it is.
Just to check that I understood it, Direct Instruction is about presenting a sequence of examples of what does and doesn’t fit a concept, geared towards making sure that the most common false ideas are falsified, and then testing with similar ideas to check for retention/comprehension?
The examples don’t have to be binary ones, those are just the easiest ones to describe and the most common. If you were teaching addition, your examples would be (more or less) addition problems, but they would still follow the same rules, with some modifications (for instance, negative examples don’t really make sense in the context of “2+3=?”).
But basically you have the right idea. Another thing I didn’t touch on in the above post is that the testing examples seem to serve a teaching role, as well. I’ve even seen example sequences in which all negative examples are left until the testing, though I haven’t read carefully enough to be able to say when one is supposed to do this.
Ah, yup. Also heard about this effect with spaced repetition.
That seems like an interesting case.
Okay, so looking into it further, this sometimes happens when teaching a “noun” concept (that is, a basic concept with multiple defining qualities and possibly a fuzzy boundary). The text has this to say about the matter:
There are multiple examples of noun sequences:
Learning the concept of a “truck”, distinguishing from already-known concepts of “car”, “bus”, and “train”. Three truck examples are given, then testing begins; no negative examples.
Learning the letter “b”, distinguishing from several already-known letters. Only two examples (“b” in two fonts) are given before the test examples, because the concept has a narrow range; no negative examples.
Identifying a “black oak” leaf, distinguishing it from other leaves. A “white oak” leaf is also shown in the test examples, because the two look very similar.
Learning the category “vehicle”, when various types of vehicles are known. It has negative examples of “swing” and “lawnmower” to demonstrate boundary cases, before the testing examples begin.
I’m surprised I didn’t make the connection between Direct Instruction and spaced repetition earlier. A lot of the theory of DI easily translates to making better spaced repetition tests.
I think it would be good to include also non-examples of “d”, “p” and “q”.
Generally, I think that any explanation should include non-examples, to show the boundaries of the concept. Otherwise you did not disprove the hypothesis that “anything is a valid example”.
My intuition about DI is that you give a few examples and non-examples such that Occam’s razor will lead the student to the correct explanation. Or in other words, “faultless communication” is one where the correct interpretation of the teacher’s words has lower (preferably: much lower) Kolmogorov complexity than any incorrect interpretation.
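A rough sketch of this intuition, using the 2-4-6 game. True Kolmogorov complexity is not computable, so the sketch substitutes a hand-assigned "cost" for each rule; the candidate rules, their costs, and the function name `simplest_consistent` are all assumptions made for illustration.

```python
# Hypothetical candidate rules for the 2-4-6 game.
# Each entry is (description, hand-assigned cost, predicate on a triplet).
# The cost is a crude stand-in for description length, not a real measure.
CANDIDATES = [
    ("any ascending triplet", 1, lambda t: t[0] < t[1] < t[2]),
    ("all even numbers", 1, lambda t: all(n % 2 == 0 for n in t)),
    ("each number is 2 more than the last", 2,
     lambda t: t[1] - t[0] == 2 and t[2] - t[1] == 2),
    ("ascending even numbers", 2,
     lambda t: t[0] < t[1] < t[2] and all(n % 2 == 0 for n in t)),
]


def simplest_consistent(examples):
    """Return the lowest-cost rules that agree with every labeled example."""
    consistent = [
        (cost, name)
        for name, cost, rule in CANDIDATES
        if all(rule(triplet) == label for triplet, label in examples)
    ]
    if not consistent:
        return []
    best = min(cost for cost, _ in consistent)
    return [name for cost, name in consistent if cost == best]


# A single positive example leaves two equally simple rules: ambiguous.
print(simplest_consistent([((2, 4, 6), True)]))

# Adding one negative example leaves a unique simplest rule, even though
# more complex rules are still consistent with the examples.
print(simplest_consistent([((2, 4, 6), True), ((6, 4, 2), False)]))
```

Under this reading, a sequence of examples communicates well when exactly one rule comes back and it is the intended one.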
One of the rules for nouns is that the negative examples you use (in the whole sequence, including testing) are ones the learner already knows. In this case, I think that, because there is such a narrow range of variation in letters, they felt like the already-known “d”, “p”, and “q” could be saved for the test examples.
I personally think it wouldn’t hurt to mention them before the testing examples, too, and this seems like something open to interpretation.
Seems like it. Now I understand why Richardson was comparing it to Zendo.
Or the 2-4-6 game ‘reversed’, yes. Before Misha’s post, I was actually going to try and straighten out the confusion over “logically faultless communication” by showing how it would apply to the 2-4-6 game ‘reversed’ as an example. I might still, depending.
Although the thing is that I’d say the best way to communicate something like ‘2-4-6’ isn’t as one of the simpler concepts in the hierarchy, but as a ‘cognitive routine’, which is made of a chain (possibly branching) of various simpler concepts that have already been taught.
As Misha said:
(Which doesn’t even get into the hidden complexity of all parts ‘black boxed’ together in the sentence as assumed already taught)
Of course it is easier to explain “logically faultless communication” by showing how it applies to more basic concepts, than to complex concepts that are made up of many of those basic concepts connected together.
The problem is that when you just show the very basic concepts as the AthabascaU module on DI does, people say stuff like this:
(Prase, in comment on first DI post.)
I anticipated this, and had tried to avoid it by injecting a little excitement, like: ‘Hey y’all, here’s something extremely valuable but complex and non-obvious. It will seem confusing and/or trivial at first, but it really is valuable!’
And in actual fact, looking back, that probably did help, because I got plenty of people going, ‘What the hell are you talking about? Give us some meat!’ rather than just, ‘Huh. Whatever.’
I’m sure there was a much better way possible of achieving the same goal, but what were the chances of me ever finding it without any feedback on actual attempts?
Editors. You are not the first person I’ve seen who could really use an editor on their material.
Editors are feedback. The only place I was gonna find suitable editors was LW. Rather than PMing people, I posted it to the section called “discussion”.
For heaven’s sake, what else was I supposed to do?! :P
A very specific kind of feedback, with understood social roles and norms of interaction; it is a ‘kind’ of person, which you can solicit for. If you cannot think how else to do it besides PMing random LWers (a perfectly valid way! Just ask politely and take no for an answer.), there are many other ways. Just off the top of my head:
Fanfiction has an ancient tradition of ‘beta’ readers, and well-developed mechanisms for soliciting beta readers.
General and literary writers have an even more ancient tradition of workshops and reviewing, which has migrated online; I seriously considered submitting my own material to one of the most active online editing communities, Critters Workshop.
University students (aren’t you one?) have free access to their fellow students (I helped out one such friend extensively with his writing), and more importantly, they have unlimited access to ‘writing centers’ or other such establishments on campus, whose job is helping students out with editing and reviewing their reports, essays, etc. (I’ve talked to a few people who work in them; they complain about how few students ever turn to them. The ESL students sometimes make heavy use of them, but no one else.)
And this isn’t even thinking outside the box a little; for example, I bet you could find plenty of editors/reviewers if you posted a Bitcoin advertisement.
(Also, little in your original post was LW-specific. Any intelligent college grad, for example, could offer good feedback on that. It’s not like you were discussing the finer details of UDT or something.)
No, I’m on a full time internship as an elementary teacher. The theory I’m studying by myself.
I’m surrounded by people who know how to deliver DI programs, how to do superb classroom management, etc, but not by anyone who could read Theory of Instruction in one weekend and write a report on it (How many posts of lorem ipsum would Misha have to make for me to upvote for the karma system to accurately reflect the props he deserves :P).
But yeah, “Just Try It”, right?
And it’s the LW community that I have a strong emotional desire to get involved in, and the eventual intended audience anyway.
I’m not sure that advice is very good when the consequences of failure apparently made you contemplate suicide.
And when your post is sufficiently bad that a single editor could have diagnosed most of the problems, it’s a little contemptuous of peoples’ time to—in effect—run it simultaneously past dozens/hundreds of editors.
I wasn’t contemplating suicide per se, I was just wondering seriously whether the reason I wasn’t was essentially a form of akrasia.
And I have major past issues. I have an intellectual belief that I will one day be able to share true mutual understanding, love, and trust with other people. Relationships that make life worthwhile.
However, due to some traumatic experiences with people I thought I had that with just completely disappearing, even though I can see in retrospect how to discriminate such unreliable people from the real deal, it takes a lot of mental pumping for me to keep up the corresponding emotional belief.
LessWrong is my HOPE. A community that makes me feel that I could expect to find that understanding, love, and trust.
So I certainly wasn’t being “contemptuous of peoples’ time” on purpose, and given that my original post had 20 upvotes at one time, I don’t think the community feels that way either.
If you understand where I’m coming from now and have changed your mind that I did make the right choice, please tell me that.
I stand by my comments; DI is interesting enough that even a badly written post is still a net benefit which could get 20 net votes. However, it would have been much better all around—for you and the readers both—if you had gotten someone to edit it, in any of the 3 or 5 ways I’ve outlined. There’s no reason a well-done DI article couldn’t be promoted to the main page, eg.
(In a related issue, I am also bothered that you apparently could not think of any way to get editors. You should work on that.)
I… this is one of those issues that if I am in the wrong, I will have to take a break and apply some more intense techniques for getting around my own defensiveness than I usually need to use.
But I really honestly feel no “small note of discord” in my mind that should make me expect to find that I am wrong.
At any rate, since it’s over and done with now, what say the both of us just put the issue far in the back of our minds to allow any potentially useful new thoughts to crystallize by themselves, and refocus our attention on the future of what we need to write about DI?
So you don’t care that most of the reaction to your article was about how it was written? You don’t care about how much time you’ve spent discussing it with me alone? (Or how much time I’ve spent, hoping that future material will be better?) You don’t care about how much the impact was muted because of all that? You don’t care about what you’ve learned about the value of clear writing? You don’t care about building a reputation as a guy who knows about something interesting but can’t write for beans? You don’t care about your apparent ignorance of editing done either by yourself or another, or how to get it, or that you were ignorant of being ignorant, or that you might be generally miscalibrated about your competence? You don’t care about sending the message that you don’t care about all the foregoing?
I’m not asking for a large note of discord, but I definitely think there should be a small note there somewhere.
I’d rather discuss you. DI is just one topic, and hopefully just the first of many topics you might discuss here. Someone else will sooner or later pick up the DI baton, but if you ignore any lessons to be learned here, when will you learn them? Sooner would be better than later.
After rereading your last comment here, I just wanted to make clear: I do care very much.
Thank you for making an excellent, explicit, compressed list of everything I did wrong. (...Where else than LW would that be obviously non-sarcastic? :P)
It is very valuable and I will be using it to improve. If I had a printer, I’d print it out and put it by my computer. (As it is I’ll just have to save it to a file I use a lot.)
I’m going to the effort of telling you this because, due to the value of the comment, I want to encourage similar feedback from you in the future.
...And if I had never made the attempt, I never would have gotten that feedback from you either. :P
Yes, I realize that you can’t just say, “Well, it’s all right, cuz I learned a valuable lesson!” and then keep doing the same dumb thing, but… whenever I have done a dumb thing, it’s cuz I haven’t learned that lesson yet! So long as you’re not twisting that into an excuse to keep doing the dumb thing, or to avoid trying to learn faster not to do different dumb things in the future, you’ve got it right, right?
So I still think “Just Try It” was good advice.
I’ll be sure to criticize you in the future, then.
Precisely. To err is human, to persevere is of the devil, or however it goes.
Most excellent Gwern!
I have a proposition!
So, I’ve begun writing a new post, “A dry presentation of some empirical evidence on DI’s effectiveness”. (An attempt to replace that intended function of my original post with as high-quality a replacement as Misha’s post was for the intended function of the ‘theory sketch’ section.)
KPier very kindly offered to help me with editing, so I sent her the first seven-ish paragraphs I had written. She found one change to recommend, somewhat ambivalent herself over which way was best. I wasn’t sure either, and found myself wondering what I’d decide in the end.
Then I started wondering about what differences in responses there might be between a post where she made all the final decisions, and a post where I did.
And then I thought… double-blind experiment! (Woot woot, raise the empirical roof. :P)
Here’s my idea:
I finish writing the post, get the ‘her final’ and ‘my final’ versions, and then make a post linking to both versions and explaining the experiment.
I’ll just label them version A and version B (flip a coin to avoid any weird bias I may have on As and Bs, not that I’d anticipate much), and ask the reader to follow one or the other (by flipping a coin to avoid any weird bias they may have; mostly just to make sure the sample sizes for each version are equalized).
Then people record their impression and give me their feedback (without directly quoting the text), and I have to try and discriminate which readers got which.
Does that sound like a neat idea? If it works well, it seems like it might even end up being worth creating an automated system for setting up and running such experiments (without all the coin flipping and link following), for people to use with appropriate posts.
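For what it’s worth, the bookkeeping for such an experiment could look roughly like the sketch below. This is only an illustration under my own assumptions: the function names (`assign_versions`, `guess_accuracy`) and the reader names are invented. One design note: independent coin flips per reader only balance the two groups on average, so the sketch shuffles the readers and splits the list in half, which keeps the group sizes within one of each other.

```python
import random


def assign_versions(readers, seed=None):
    """Randomly split readers into two near-equal groups, 'A' and 'B'."""
    rng = random.Random(seed)
    shuffled = list(readers)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # The first half of the shuffled list reads version A, the rest version B.
    return {
        reader: ("A" if index < half else "B")
        for index, reader in enumerate(shuffled)
    }


def guess_accuracy(assignment, guesses):
    """Fraction of readers whose version the author guessed correctly."""
    correct = sum(
        1 for reader, guess in guesses.items() if assignment[reader] == guess
    )
    return correct / len(guesses)


# Hypothetical reader names, for illustration only.
readers = ["r1", "r2", "r3", "r4", "r5", "r6"]
assignment = assign_versions(readers, seed=0)
print(sorted(assignment.values()))
```

If the author’s guesses from feedback alone land near 50% accuracy, the two versions are probably hard to tell apart by their effect on readers.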
Luke did such a test recently. It’s probably useful for feedback (right now, his two versions are at 20 and 3 karma), but really annoying for commenters. I would recommend getting some beta testers instead (I volunteer). Even a small sample of readers should be able to catch most relevant problems.
Thanks! I did think it sounded annoying for commenters, and I don’t want to try the general audience’s patience much further at this point. That’s why I’m just asking a few people what they think of it in the comments.
Being able to calibrate myself objectively is an extremely attractive idea, though.
It’s been done before, but not often, so I infer it doesn’t work well. Possibly this is just due to clumsy implementation.
And I’ll be sure to strive to make your job much smaller. =]
I appreciate your willingness to have an in-depth discussion of this topic with me, and if I had infinite time, I would gladly take you up on it. But since I don’t, I’m sorry, but I’m going to have to bow out of discussing the subject of me in order to have more time for the subject of DI.
I have already learned lots of things I think I could apply to better accomplishing similar goals in the future. And I don’t anticipate having to introduce another such wide, deep, complex topic as DI from scratch again.
Again, thank you, and I hope to see more of you in the DI discussion.
You know, it’s also possible to learn a totally different lesson here: We should be gentle with people, because they are often much more vulnerable than we assume. I would argue that this lesson is much more generally applicable than “Be more scared of failure.”
It’s far easier to get the person with vivid memories of contemplating suicide to be more cautious than to get everyone in general to be more gentle.
Well, that’s certainly a good point, but I wasn’t talking to everyone in general, either.
Please see “I wasn’t contemplating suicide per se”. I knew in advance that I would decide to keep fighting, as I always do. It is actually a technique I use to cheer myself up, rather like being underwater, and dipping down just a bit so that you can kick off the ground in order to spring back to the surface.
I certainly know I’m very careful thanks to my own experiences to avoid causing unnecessary pain to others, and in fact to try very hard to make people happier.
But I am not fragile. I hurt, but I never break.
I thanked people for their harsh criticism, remember.
Is there any reason that this isn’t a front page post?
I didn’t even think about putting it there because the other posts I was following up on were in discussion. I’m pretty sure I could move it now if there is support for the move.
I think a front-page post would need to give more context and cover a bit more material but I can probably do that, too.
I support moving it.
Perhaps you could work with Owen? He had some additional material.
I would love to help, of course. Right exactly now I have some stuff from my internship I should bump to the top of my priority list (there were some minor problems last week with the kindies not following the proper procedures for asking for my help when I’m helping someone else, and I need to whip up a short script to model the expectations with the other teachers—Honestly, a huge percentage of behavior problems with kids, especially the youngest ones, are just from them honestly not knowing what you want from them).
But that kind of project is nowhere near as demanding as this, so yeah. Please do use me as a resource. I’m a lot easier to understand in a real-time discussion format.
But I think we might want to hold off on putting this on the main page for a little while, although that might just be because I’m so familiar with the depths of how far DI goes that even a very deep introductory piece seems shallow to me. It’s everyone else’s decision, obviously, since people who aren’t super-familiar with it are the audience.
Good article, upvoted! I’ll definitely try that out with teaching chess (where DI methods might be especially hard to apply because there are no hard boundaries for examples, but I’ll try anyway).
Compare the post The 5-Second-Level, where we also talked about useful heuristics for teaching things (I didn’t read through the entire discussion over again, but I do find that this bit is remarkably similar to what DI seems to be about).
Yeah, the LW community should find lots of things in DI that seem remarkably familiar once understood...
Project Follow Through, the study most frequently cited as proving the benefits of Direct Instruction, is far from perfect. Neither classrooms nor schools were randomly assigned to curricula. It’s not clear how students ended up in treatment vs. comparison groups, but it probably happened differently in different communities. See http://en.wikipedia.org/wiki/Project_Follow_Through#Analytical_methods for a bunch of info and more references.
Yes, Project Follow-Through had some problems, but I don’t think it’s likely that those problems provided a systematic bias towards DI sufficient to explain away the huge differences as non-significant, especially since similar results were replicated in many smaller studies that were in a situation where better random assignment etc was possible.
“Research on Direct Instruction” (Adams and Engelmann, 1996) goes into much better detail on Follow-Through and those other experiments.
Actually, it basically covers three different types of studies:
Those dealing with the relative effectiveness of DI compared to other models (in a meta-analysis)
Those pinning down the internal details of DI theory, validating unique predictions it makes (about the effect specific variations in sequencing, juxtaposition, wording, pacing, etc should have on student performance). Only one prediction ever came out differently than expected: that a sequence of examples starting with negatives would be more efficient at narrowing in on a concept for the learner. It was found that while this did hold with more sophisticated older learners, more naive younger students simply interpreted ‘This is not [whatever]’ to mean ‘This is not important, so don’t attend to this’.
Those demonstrating ‘non-normative’ outcomes. For instance, calling Piagetian developmental theory into question.
You should be able to find the book at a local university library. Could you get your hands on it? I’d love to hear what you think after reading it!
DI sounds like Zendo. I wonder how you could use Zendo in school. When I think of things that people learn (history, vocabulary, spelling, arithmetic, painting, dancing, musical instruments, basketball, anatomy), not one thing comes to mind that could be taught this way.
Yes, this post is just an introduction to the very basics of DI (technically to just the very basics of one half of it, the ‘stimulus-locus analysis’).
Theory of Instruction goes into detail on those fundamental principles and how they apply to teaching the most basic concepts. It then shows how the basic concepts can be built up into more complex ones, and therefore how more complex ones can be analyzed to reduce them into parts for teaching.
Once you understand the details, you’ll probably just say, “Oh, right, reductionism. Of course that also applies here.”
Excellent!
Idea I got that might be useful: something like a “top 10 heuristics for bridging inferential gaps quickly” using a similar approach but generatable in real time.
This is the “20 questions” or “animals” program, which used to be standard in /usr/bin/games on all Unix systems. It isn’t of much value in practice.
That seems like the inverse problem. With the 20 questions game, your AI learns miscellaneous qualities of a whole lot of objects. With this method, your AI learns examples of a quality that it can, if the algorithm works, use to recognize the quality in other examples.
If it works, anyway.
DI is a theory of instruction, not of learning.
If you’re interested in judging in greater detail how DI might offer any ideas on AI that are both useful and original, the place to start would be Inferred Functions of Performance and Learning (Engelmann and Steely, 2004), which does attempt to set out a theory of learning (and the logically necessary things that must be going on inside of any system that performs a given behavior, whether learned or unlearned).
Please see this comment.
Actually, it just occurred to me, when you said:
Were you one of the people I explicitly pointed that connection out to, or did you have the opportunity to notice it for yourself?
Yeah, I wasn’t gonna mention this for ages, but the book Inferred Functions of Performance and Learning by Siegfried Engelmann and Donald Steely might contain some useful original ideas relevant to Artificial Intelligence, but I haven’t read it myself and really have no idea beyond “sounds plausible”.
(That is, I know I’ve been communicating very high certainty that DI is a very big deal when it comes to education, and I’m afraid some may have decided I have a general ‘having very high certainty that things are big deals’ trait, and thus misinterpret this recommendation as far stronger than it’s intended.)
But Zig himself has a short description of what the book’s supposed to be about here, so you might be able to come to a better conclusion yourself just by reading that.