Polaris, Five-Second Versions, and Thought Lengths
Author’s note: During CFAR’s 4.5-day workshops, concepts that had been formalized as “techniques,” and which could be described as algorithms and practiced in isolation, generally received 60+ minute sessions. Important concepts which did not have direct practical application, or which had not been fully pinned down, were often instead taught as 20-minute “flash classes.” The idea was that some things are well worth planting as seeds, even if there was not room in the workshop to water and grow them. There were some 30 or 40 flash classes taught at various workshops over the years; the most important dozen or so make up the next few entries in this sequence.
Polaris
Imagine the following three dichotomies:
A high school student mechanically following the quadratic formula, step by step, versus a mathematician who has a deep and nuanced understanding of what the quadratic formula is doing, and uses it because it’s what obviously makes sense
A novice dancer working on memorizing the specific steps of a particular dance, versus a novice who lets the music flow through them and tries to capture the spirit
A language student working on memorizing the rules of grammar and conjugation, versus one who gesticulates abundantly and patches together lots of little idioms and bits of vocabulary to get their points across
By now, you should have a set of concepts that help you describe the common threads between these three stories. You can point at goal factoring and turbocharging, and recognize ways in which the first person in each example is sort of missing the point. Those first three people, as described, are following the rules sort of just because: they’re doing what they’re supposed to do, because they’re supposed to do it, without ever pausing to ask who’s doing the supposing, and why. The latter three, on the other hand, are moved by the essence of the thing, and to the extent that they’re following a script, it’s because they see it as a useful tool, not because they feel constrained by it.
How does this apply to a rationality workshop?
Imagine you’re tutoring someone in one of the techniques—say, TAPs—and they interrupt to ask “Wait, what was step three? I can’t remember what came next,” and you realize that you don’t remember step three, either. What do you do?
You could give up, and just leave them with an incomplete version of the technique.
You could look back through the workbook, and attempt to piece together something that makes sense from bullet points that don’t really resonate with your memory of the class.
Or you could just take a broader perspective on the situation, and try to do the sensible thing. What seems like a potentially useful next question to ask? Which potential pathways look fruitful? What step three would you invent, if you were coming up with TAPs on your own, for the first time?
The basic CFAR algorithms—like the steps of a dance or the particulars of the quadratic formula—are often helpful. But they can become a crutch or a hindrance if you stick to them too closely, or follow them blindly even where they don’t seem quite right. The goal is to develop a general ability to solve problems and think strategically—ideally, you’ll use the specific, outlined steps less and less as you gain fluency and expertise. It can be valuable to start training that mindset now, even though you may not feel confident in the techniques yet.
You can think of this process as keeping Polaris in sight. There should be some sort of guiding light, some sort of known overall objective that serves as a check for whether or not you’re still pointed in the right direction. In the case of applied rationality, Polaris is not rigid, algorithmic proficiency, but a fluid and flexible awareness of all sorts of tools and techniques that mix and match and combine in whatever way you need them to.
Or, in other words: you’re here to solve your problems and achieve your goals. Everything else in this sequence is useful only insofar as it helps with that.
Five-Second Versions
“Using a CFAR technique” often doesn’t mean taking out pen and paper and spending minutes or hours going through all of the steps. Instead, it involves five seconds of thought, on the fly, when a relevant situation arises.
Examples:
Murphyjitsu: You agree to meet a friend for coffee, and quickly run the plan through your inner simulator before ending the conversation. You hastily add “Wait! Let me make sure I have your phone number.”
Internal double crux: You notice that you don’t feel an urge to work on this email that you’re supposed to send. You spend a second to visualize: if you become the sort of person who does feel such an urge, will something positive result?
Goal factoring: You notice that you’re feeling tension between two possible outings over the weekend. You quickly identify the best thing about each, and see whether one can incorporate the other.
Aversion factoring: You keep feeling bad about never getting around to reading dense nonfiction books. You consider whether System 1 may be right here; perhaps it really isn’t worth the trouble to read them?
TAPs: You’re two minutes late for a meeting, and think about what trigger could cause you to leave five minutes earlier in the future.
Systemization: You feel a vague annoyance as you’re sorting through your pantry, looking for the chips, and you decide to move the bag of rice all the way to the back, where you can still reach it over smaller, more frequently used items.
Note that these five-second versions often only use a fragment of the technique (such as checking whether an aversion is well-calibrated), rather than thoroughly applying every step.
Some advantages of using five-second versions:
You can use them more often, at the moment when they’re relevant, without having to “boot up” an effortful, time-consuming mode of thinking (many CFAR instructors use these something like twenty times per day).
You can integrate them fluidly with your thinking, rather than having to interrupt your flow and remember what thoughts or activities to return to.
You can practice them many, many times.
You can develop multiple variations, including your own independent inventions.
Most importantly of all: if a larger, more effortful version of a technique is something you simply will not do, then a five-second version you will do is infinitely better than nothing.
Thought Lengths: The Ray Model
If I say “Hi, how are you?” and you live in white middle class America, you’ll almost certainly say something resembling “Pretty good, you?” If I ask something like “What’s happened this week that you’ll remember five years from now?” I’ll get a response that’s a lot less predictable, but it’ll most likely be made out of words that I at least sort of understand.
There’s a lot going on in the space between question and answer, and thanks to the work of generations of psychologists and neuroscientists (and a few unlucky souls with iron rods through their brains and so forth), we’re getting closer and closer to having some clear/workable/reliable causal models.
We don’t have them yet, though, and while we’re waiting, it’s interesting to see what we can accomplish if we don’t even try. Call it a black box, and treat humans as complicated input/output devices with a whole bunch of levers and knobs—a stimulus goes in, some stuff happens under the hood, and a response comes out.
Imagine the stimulus/response pattern as a ray or vector, and your mind as a surface. The external, sensory universe is everything above the surface, and the internal, cognitive universe is everything below. Something—say, a question—sparks a line of thought, and that line of thought leads to something else—like an answer.
If the stimulus/response doesn’t take very long (it’s an easy question, or a familiar motion like catching a tossed ball, or a visceral response like one’s reaction to a strong smell), then in our model the line will be short, as will be the distance between the input and the output.
If, on the other hand, there’s significant processing involved, then we can imagine a much longer line, and a greater distance between input and output:
“Let’s see, 50 x 30 would be 1500, so 47 x 30 would be three thirties less than that, or 1410, and then we need to add a couple of forty-sevens, so … 1410 + 94, which is 1504. I’m like … ninety percent confident, there?”
In the example above, the thought process is fairly straightforward (at least for people who are comfortable with mental math). Once you’ve picked a strategy, it’s mostly just churning away until the calculation is complete.
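The chain of steps in that quote can be replayed explicitly. The sketch below assumes the problem being solved is 47 x 32, which is inferred from the steps rather than stated in the text; the variable names are illustrative only.

```python
# Assumption: the quoted reasoning is computing 47 * 32 (inferred, not stated).
round_estimate = 50 * 30                        # "50 x 30 would be 1500"
forty_seven_thirties = round_estimate - 3 * 30  # "three thirties less than that"
answer = forty_seven_thirties + 2 * 47          # "add a couple of forty-sevens"

print(forty_seven_thirties, answer)  # prints: 1410 1504
assert answer == 47 * 32
```

Each line corresponds to one clause of the spoken reasoning, which is what makes this a "straight march" from stimulus to response: once the strategy is chosen, no step branches.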
There are plenty of stimuli, though, that don’t cause a straight march from stimulus to response, but instead send us all over our own minds, activating a large number of concepts and processes before finally cashing out to some new conclusion or action.
And furthermore, there isn’t always a single line. Sometimes, the same stimulus can spark multiple threads of thought, each of which will have its own length and path.
It’s also kind of fun to imagine what happens when things get subconscious, such as when we find ourselves making connections or entering emotional states that we can’t fully explain or justify. It’s pretty easy to imagine a second, deeper, opaque-ish surface that represents the limit of what we can “see” with our metacognition, but we’ll hold off on that for now, lest we summon the ogres.
Astute participants may be thinking “isn’t this just System 1 and System 2 again?” and there is certainly a lot of overlap with that model (which is another wrong-but-useful approximation).
However, where S1 and S2 are discrete (or at least discrete-ish), this model instead treats the range of possible thoughts as continuous. There is no single bucket for “short thoughts” that is distinct from a single bucket for “long thoughts.” Instead, all thoughts are treated as the same basic sort of thing: some amount of below-the-surface processing, ending in an output.
What makes this model interesting from an applied rationality perspective is that it raises the question of whether a given thought is an appropriate length.
Some thoughts are too short, and need to be lengthened, and some CFAR techniques can be thought of as designed to do precisely that. Think of goal factoring and focusing, for instance, which take flinches and decisions that might otherwise be somewhat knee-jerk, and slow them down, flesh them out, and expand them, allowing for more processing before a final output.
Other thoughts are too long, and need to be shortened. CFAR has fewer named techniques in this arena, but TAPs and CoZE both play in this space, as well as Resolve Cycles. The whole concept of policy-level decisionmaking is similarly a thought-shortening frame—the idea being to set a policy so that future instances of a given problem or scenario can be addressed quickly and without a lot of meandering.
It’s interesting to ask oneself the question “Where do I go wrong because I put in too little thought, and arrive at my outputs too quickly?” and it’s worth asking the mirror question “Where am I spending too much time and attention, and should instead be working to shorten the processing time between stimulus and response?”
Some stimulus/response patterns that tend to be too short for many people:
Sudden changes in plans, which cause them to grumble and grouse even if the new plan is better
Unanticipated requests for time or energy, which often lead people to overcommit and make promises they start to regret later
Rounding-off, in which people halo-effect or horns-effect other people, plans, or activities, losing opportunities to factor or mix and match
CFAR canon has a handful of techniques that are good at increasing the distance between input and output, and once you get “some thoughts are shorter than they ought to be” into your head as an organizing principle, you may find yourself reaching for those techniques more frequently and more appropriately.
Conversely, some stimulus/response patterns that tend to be too long:
The amount of psyching up that people often have to do before performing some task, especially a challenging physical one (like a round of pushups or a complex move like a backflip)
Rumination loops on decisions and consequences that are firmly in the past and have no further lessons for you to learn
Decision paralysis, where the expected value of further investigation or weighing-of-the-options is far smaller than the cost in time and attention
…and again, there are techniques that can help. Being able to think “oh, this is a line of reasoning that I should be able to skip to the end of, or at least cache somehow once I finish, so that I can simply call it back up and don’t have to rederive it every time,” has been a big net positive for many people.
Finally (though this is a small benefit), the simple visual metaphor of moving the exit point for a given thought can help with things like non-useful emotional triggering during intense conversation. The above model has helped some CFAR staff recognize certain … golf holes? Geysers? Lava tubes? … where their thoughts tend to drift, and given them a clear way to evaluate potential replacements (“Is this new kind of ‘answer’ far enough from my old habits and reflexes that I won’t just slide right back into my previous ingrained behavior?”).
It’s neat that this model post-dicts a lot of things that make sense for entirely different reasons (such as slowly counting to ten before speaking, or rehearsing a given mental process until it becomes easy). As far as “tools you could teach a ten-year-old” go, we posit that this one has a lot of potential in terms of its sensibility and versatility.
As the person who introduced CFAR to PCK and who created the “Seeking PCK” class, I want to add two comments.
First, Duncan, this is a dynamite write-up. I like how you even dug up the detail of where the terms “partitive” and “quotitive” came from. I actually didn’t know that! It’s kind of obvious in retrospect.
Second, when I was at CFAR this class was a full 1+ hour class, usually on the first night. I pulled from my own experiences with clinical interviews to create an exercise that seemed to stick with quite a few participants. I’d write 24–16=12 (in vertical form, like from an American elementary subtraction algorithm) on the board, pointing out that “the student made a mistake” didn’t actually result in understanding what the student did. Rather, it described what the student didn’t do.
Then I’d inform them (truthfully) that I was actually running an algorithm I’d seen in my clinical interview days. (My Ph.D. is in math education.) I pointed out that if they actually understood what the student was thinking, they should be able to predict how they would perform on other similar problems.
So from that point I’d invite the whole group to toss out example problems they’d want to give this student in order to probe their understanding. We’d do them in two batches: I’d ask for problems, give everyone a chance to make predictions about how the student would answer each one, and then I’d go through and answer them. Then we’d do it one more time.
Lots of people assume the student was just doing columnwise subtraction of the larger number from the smaller number, ignoring order. So usually it kind of blows everyone’s mind when they see something like 53–25=25.
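That common guess is easy to state precisely, which also makes it easy to test against the two answers quoted so far. A minimal sketch, assuming two-digit inputs; the function name `columnwise_abs_diff` is made up for illustration and is not from the class:

```python
def columnwise_abs_diff(top: int, bottom: int) -> int:
    """Hypothesized (and, as it turns out, wrong) model of the student:
    in each column, subtract the smaller digit from the larger one,
    ignoring order. Assumes both inputs are two-digit numbers."""
    tens = abs(top // 10 - bottom // 10)
    ones = abs(top % 10 - bottom % 10)
    return 10 * tens + ones

# Student answers quoted in this comment.
observed = {(24, 16): 12, (53, 25): 25}

for (top, bottom), student_answer in observed.items():
    predicted = columnwise_abs_diff(top, bottom)
    verdict = "matches" if predicted == student_answer else "does not match"
    print(f"{top}-{bottom}: model predicts {predicted}, "
          f"student wrote {student_answer} ({verdict})")
```

The model reproduces the first answer but predicts 32 where the student wrote 25, so it cannot be what the student was doing.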
Often people will object that the student “isn’t being consistent” — which, again, describes what the student isn’t doing rather than what they are doing.
It’s sort of forehead-slapping when I reveal the algorithm. Suddenly most of the data makes tremendous sense. (I say “most” because sometimes the kid had to be creative when their method didn’t apply directly to the example.) I thought it was a great demo in understanding others’ minds, asking relevant questions, and epistemic humility. If I remember right, it was an example that stuck with a lot of participants too.
I guess after I left CFAR, the crew stopped using this exercise. I’m honestly not sure why. Based on Duncan’s intro here, I’m guessing they just felt it wasn’t as central a point for participants to experience as other things, and they figured the 80/20 here was just pointing out the existence of PCK.
(The connection, in my mind, is that the kind of precise curiosity needed to navigate the puzzle is exactly of the type needed to gather PCK.)
But really, I don’t know.
I just figure folk here would like to know a bit of the history of that tidbit.
Ah, and speaking of history, one other detail about PCK y’all might like to know:
I introduced the idea to CFAR back in the summer of 2012 during an internal colloquium talk series. It helped define a lot of how we thought about unit creation thereafter. We talked explicitly about PCK for various units, and for running the workshop as a whole, for the rest of the time I was there.
The “Seeking PCK” unit didn’t exist until quite a bit later. I don’t remember when we introduced it honestly. Maybe 2016?
But the basic idea is part of CFAR’s memetic DNA at this point. We even talked explicitly about how to transfer PCK between people when we were handing a unit off to a new teacher.
Seeking PCK was a full (hour or longer) class at every mainline workshop since October 2016 (sometimes called “Seeking Sensibility” or “Seeking Sense”). After you left it was always a full hour+ class, almost always taught by Luke, and often on opening night.
The concept of PCK became part of the workshop content in April 2014 as a flash class (as a lead-in to the tutoring wheel, which was also introduced at that workshop). In October 2016 we added the full class, and then a couple workshops later we removed the flash class from the workshop. Something very close to this chapter made it into the first draft of the CFAR handbook in May 2016, when PCK was still just a flash class, and I guess the chapter didn’t ever get expanded or moved.
After the class was transferred from Val to Luke, Luke was involved in teaching it until the last pre-covid workshop in January 2020. I’m pretty sure he kept the subtraction exercise (that exercise wasn’t removed from the handbook, it just never made it in). A couple other people also taught the class at some point (including Duncan in April 2017), I’d guess at workshops where Val or Luke was absent.
At the January 2020 workshop a new instructor was learning the class & taught some of it along with Luke. I suspect that’s why it was part of the day 1 rotation that workshop rather than being opening night (since it’s helpful for a new instructor to have repetition & smaller groups, and a new instructor’s version of a class hasn’t necessarily cohered enough to be ready to set the tone for the workshop on opening night).
(This history mostly based on records I looked up, supplemented by memory.)
Yep, this all sounds right to me.
I honestly don’t know how to make the subtraction example work in a handbook format. It really does best as something interactive. A lot of the punch of it evaporates if folk don’t get a chance to encounter their confusion after getting their own prompts answered.
Both I and Luke Raskopf tried our hand at teaching Seeking PCK as a full class, and (in my opinion) did a decent job—perhaps 85% as effective as what you were doing.
After that, though, it began to shrink. EDIT: See Unnamed’s comments above. If you’re interested in fleshing out the writeup à la the other full-class entries, I would happily include it as the full class that it indeed was. I just discovered that Turbocharging was also skipped over in similar fashion because of having been moved to a “retired” section of the version of the handbook I’ve been working from.
FYI I think even the current form of PCK described here feels large enough to be its own post.
Don’t leave us hanging… Could you please provide one or two more examples?
34-16=12?
63-17=11?
The point was usually to illustrate a variety of different mental motions that a young mathematician might be making, OTHER than the one intended by the algorithm. So all sorts of examples are possible, and there were a number of things found in clinical interviews. Not just instances of one single pattern.
34–16 yields 13
63–17 yields 16
For the four examples of
24-16=12, 53-25=25, 34-16=13, 63-17=16
is this the pattern?
ab-cd=ca
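The guessed pattern can be checked mechanically against the four data points. This is only a sketch of the commenter’s hypothesis (nothing in the thread confirms it is the student’s actual algorithm), and the helper name `guess_ca` is invented here:

```python
def guess_ca(top: int, bottom: int) -> int:
    """Hypothesis 'ab-cd=ca': the answer's tens digit is the bottom
    number's tens digit (c), and its ones digit is the top number's
    tens digit (a). Assumes both inputs are two-digit numbers."""
    a = top // 10
    c = bottom // 10
    return 10 * c + a

# The four student answers listed in the thread.
examples = {(24, 16): 12, (53, 25): 25, (34, 16): 13, (63, 17): 16}

for (top, bottom), student_answer in examples.items():
    assert guess_ca(top, bottom) == student_answer

print("ab-cd=ca reproduces all four examples")
```

Fitting four points does not show that this is what the student was thinking; by the exercise’s own standard, the hypothesis would still need to predict the student’s answers on fresh problems.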
This might just be nitpicking, but given that the very same post is also talking about how valuable it is to get genuinely curious about why people might not be learning well, it seems worth mentioning...
My first reaction reading those examples was not that the people in question were missing the point. Rather I took it to mean that they were at a stage of their learning where they were learning the basic technique and did not yet have it automated enough to have any working memory to spare to also think about the big picture. At such a point, it doesn’t seem clear to me that stopping to ask questions about the big picture would even be beneficial; procedural “how” understanding and conceptual “why” understanding usually develop hand-in-hand, so you can’t reason about the “why” very well before you have enough of the “how” down (and vice versa).
Of course it’s possible to get stuck in only the “how” and not even try to understand the “why”, but to me the examples as written don’t convey that these people are making that particular mistake.
Per discussion below under Valentine’s comment, the Seeking PCK class has been broken out into its own standalone essay. Another flash class (on thought lengths) has been put in its place in this entry.