Feedback-loops, Deliberate Practice, and Transfer Learning
Moderator note: the following is a dialogue using LessWrong’s new dialogue feature. The exchange is not complete: new replies may be added continuously, the way a comment thread might work.
If you’d also be excited about finding an interlocutor to debate with, dialogue with, or be interviewed by: fill in this dialogue matchmaking form.
Raemon, you recently wrote Feedback-loop first Rationality. I think it’s one of those “huge if true” ideas in rationality, that a lot of people, including me, have spoken about over the years. (“If we’re really serious about this rationality thing, surely we should be able to train it deliberately and then Succeed On Purpose at hard tasks”).
I’m pretty excited that you’re picking up the torch on this, and am curious about what solutions Raemon’s taste in doing stuff will generate in this area.
[Sidenote to readers: I think this conversation is pretty readable even if you haven’t read the background content]
My first question is: Was there a particular moment, incident, or insight, that caused you to start your current venture into “feedback-loop first rationality”?
I. “Thinking Physics Style Rationality”
The thing-that-would-become Feedbackloop Rationality started in April 2021.
At the time Oli was oriented around “We should aim to build a campus of ~1000 competent people working on x-risk.”
But he had a specific sub-hypothesis of “Also, we need alignment people to have some kind of real feedbackloops. I think they should maybe, like, subclass in physics or material science or something. Alignment is a very vague, head-in-the-clouds domain. It depends on philosophical competence. They should somehow demonstrate their thinking-competence in domains where we can tell they have any idea what they’re talking about.”
But, many people reacted “I don’t have time to do that. Spending months or years on some other domain that isn’t the most important domain seems kinda like just a waste.”
So I was asking “Okay can we somehow speedrun people through this? Do we just want to check their philosophical competence, or train it? How can we do either of those quickly?”
I came up with the idea of trying a Thinking Physics training camp, where people just solve Thinking Physics problems all day, and then doing that again for two other domains that were as different-as-possible from Thinking Physics while being concrete. People who demonstrated competence at three different domains seemed like they’d be reasonably likely to know what the fuck they’re talking about re: Alignment. (I had some dreams that this could evolve into something like Eliezer’s Class Project fiction, where imaginary future rationalist-students have to solve advanced physics in one month).
Then the Lightcone team was busy for 2 years.
I think recently I decided I wanted to just write up the idea, even if I wasn’t planning to do it. Then people (i.e. John Wentworth and Eli and Vaniver) were giving me shit for not having actually tried the idea. And then I did.
It does seem like you left on a perfect cliffhanger! I’m curious what you actually did. But if you don’t want to recount the whole story, feel free to just share something you encountered that you found interesting or exciting.
II. Early history of “Ray learning to think on purpose”
I started putting a timeline together of stuff that feels relevant, and found myself wanting to start further in the past.
This part is The Past, I’ll do The Present in a bit.
2011
In 2011, I join the rationality community. I feel noticeably dumber after a couple years. Less able to have good ideas on my own. My story [flag: story] is that I learned the habit of “Ask smarter rationalists (i.e. Zvi) what they’d do” rather than think for myself.
2015
In 2015, I join Agora For Good, a startup trying to be a new GiveWell. I notice that the other people (the CEO, and the lead marketing guy) are constantly having to hype the company and focus on selling it, and no one’s primary job is actually to think about our company’s epistemics. Zvi doesn’t really care about the company, so he won’t help me. If I want the company to have good epistemics, I have to take responsibility for it myself.
I think the act of having to take epistemic responsibility was something like my actual awakening as a real rationalist.
(I do not noticeably help the company, and eventually get fired right around the time I was thinking of quitting. So this doesn’t go that well, for various reasons, but I feel mental muscles flexing that I hadn’t used since 2011.)
2018
In 2018ish, I read SquirrelInHell’s Tuning your Cognitive Strategies, try applying it to a medium-difficulty puzzle, and immediately see some ways my thought process could be improved. But I mysteriously don’t do this very often. Still, 2-3 times I run a Tuning Your Cognitive Strategies workshop, each time with ~25 participants, and each time ~1⁄4 of people raise their hand when I ask “did a noticeably valuable effect size happen to you?”
...
Also in 2018ish, I read Romeo Stevens say “you should deliberately practice deliberate practice until you can quickly identify and study feedbackloops”, and it feels really exciting to me. I try to practice this on both programming and an iPhone game called “Downwell.”
I do not super succeed at either of these, and feel kind of demotivated.
When I try practicing debugging, I get Oli/Jim to help, but they aren’t that great at articulating their thought processes. Oli kinda kept just grabbing the laptop and doing it himself, which made it hard to learn. Jim tries to articulate stuff, but it mostly comes in the form of opaque tastes about where the problem lives.
...
Around this time, I learn the Art of Noticing from reading Logan’s blog. Sunset at Noon is the story of me learning how to actually notice confusion and do something useful with it.
Okay, skipping ahead to now:
I write up an initial blogpost on “Thinking Physics” style rationality. i.e. where you spend ~a day trying to solve a Thinking Physics question and get it right with 95% confidence.
I talk to Eli about it, he’s a little excited but thinks I should try it for a day.
I say “I dunno I don’t think I’ll learn much from trying it for a day. I’ve done stuff similar to this before.”
John gives me shit about not having actually tried it. Unlike Eli he stares at me more disdainfully and I feel more sheepish.
Robert and I try it for a day.
(I in fact don’t think we learn all that much from it; I learn some object-level implementation stuff, and it is, like, valuable to have done it for a day.)
Ruby takes a vacation. I take the time to actually do a 2-week sprint focused on what I call Thinking Physics Rationality.
Interesting. It occurs to me this sequence ends with two situations where you do not succeed at acquiring a skill (playing Downwell in 2018 and trying to get better at debugging with Oli and Jim’s help).
Presumably you feel more optimistic about attacking those problems now, emboldened by your thinking on feedback loops.
What, if anything, have you realised now that you didn’t know then, that you think would enable you to succeed at Downwell or getting better at debugging code?
To get a sense of your thinking, I’d also be quite curious about how, concretely, you’d approach either of these problems.
I feel like I am significantly better at coding now, and at figuring out how to figure-things-out when I’d previously be struggling.
A thing I am tracking as you/I work [note: Jacob and I have been pairing together on coding for LW lately] is what sort of exercises I could give you in-the-moment that’d be a good middle ground between “letting you flail and learn the hard way” and “just telling you the answer.” (It’s tricky because you’re a different person than I who might need different things).
But I feel fairly excited about integrating better coding-deliberate-practice into our org, if anyone else wanted to try it.
I am currently trying to get better at Downwell. I still have not noticeably gotten better. I set myself a goal of “beat the game this week”, which I indeed don’t look on track to meet, and I think it’s a pretty important test. I do feel like I have a bunch of tools I didn’t previously. I’m not sure what a realistic timeframe to beat it in is. I guess I should be tracking how much time I spend in total.
How come you were able to learn this? Why did you not just plateau at your previous level?
Hmm. When did I actually get better? Thinking through history:
In 2021, Jim did try prioritizing learning to teach others, and I think over time has gotten better at understanding and articulating his thought processes.
For a long while, I had a belief I should hire tutors for things. Ruby at some point hired Vlad as a tutor, and then several months later I hired him for a couple days.
At some point, Jim learned to pair with people. He described this recently as a skill he had to learn, which allowed him to stay focused on some things better, and stay in touch with EA Forum people and LessWrong people. I think EA Forum maybe had a culture of pairing (partly because they were a remote org, where it was really valuable for staying in contact with each other?)
Ruby and I would pair a fair amount, and one of the first things we did together when I re-joined him was work together on understanding some gnarly login-issues that were a fairly difficult challenge for us. We’d argue a bit about the right way to do things. Neither of us was actually that good, but pairing helped us stay focused and figure stuff out.
We eventually hired Robert, who had more demonstrable skills, and Ruby and I both paired with him a lot.
So, basically in 2022 we started doing a lot more pairing. I think it kinda turned out that Oli-2018 was… just actually below-average at explaining his thought process. Jim learned to explain his thought process more, I think partly from a combo of pairing a lot (where it’s more necessary to explain your thinking), and partly from somewhat deliberately trying to improve on other axes.
A related bit was during the phase where I hired Thinking Assistants (Sarah and Robin Goins). I made a Roam doc of different thinking tools that I would use. Having Sarah and Robin around helped me notice when I was bouncing off stuff and it was time to employ a Thinking Tool. In particular, Robin taught me the tool of “take lots of liveblogging notes”.
Also, just, ChatGPT is a gamechanger. I think by now I’ve gained some skills that aren’t just dependent on ChatGPT, but it means whenever I got stuck I didn’t lose momentum because there was an obvious tap of “just ask it what to do next.” I think in the alt-world where ChatGPT didn’t arrive the same year as Pairing-with-Ruby/Vlad/Robert/Jim, I’d have had to get a lot better at googling, which is much harder.
Notable events in Ray-coding-history:
I think a thing I gained slowly over the years, and crystallized a bit from pairing with Ruby, Vlad and Robert, was “actually go all the way up the stack trace.” I previously would read the parts of the code that I understood, and then… just not actually follow “okay this part is broken” once the stack trace left the places I understood, because it was scary.
Related: A thing I gained from delib-practice/feedbackloop/think-physics-land is “how to handle overwhelm”. A problem I often run into with coding is “well, I have 7 working memory slots. And I need to figure out 5 new subsystems I don’t really understand. And each of them has 5 sub-sub-problems I need to figure out.” And previously I’d be like “well, I guess I’m just fucked and I need to go get Oli/Jim”, and now I’m like “okay, this will be annoying but if I just log all my thoughts in my journal and keep a written model of what’s going on, I’ll make it through”. (There’s a minimal sketch of what I mean by a “written model” after the list below.)
Actual demonstration of success:
refactoring the rate-limit code, which took some very gnarly spaghetti code (formed by the EAF and LW teams both working on rate limits at the same time) and streamlined it.
building the AutoRateLimit code, which fleshed that out into a good abstraction that was easy to build on. (JP commented that the code was nice to read).
generally not being intimidated by bugs more recently
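To make the “written model” move concrete (as flagged above), here’s a minimal sketch of it as a data structure. This is illustrative only: the names are hypothetical, not actual LessWrong code, and in practice the journal is literally just a page of notes.

```typescript
// Illustrative sketch only: these names are hypothetical, not actual
// LessWrong code.

interface OpenQuestion {
  subsystem: string; // e.g. "login flow", "rate limits"
  question: string;  // the thing I don't understand yet
  answer?: string;   // filled in once I've figured it out
}

const journal: OpenQuestion[] = [];

// Vomit a confusion onto the page instead of holding it in one of
// ~7 working-memory slots.
function note(subsystem: string, question: string): void {
  journal.push({ subsystem, question });
}

// Record a resolution once a sub-problem clicks.
function resolve(question: string, answer: string): void {
  const entry = journal.find((q) => q.question === question);
  if (entry) entry.answer = answer;
}

// The payoff: "what am I still confused about?" becomes a lookup,
// not a feat of memory.
function stillConfused(): OpenQuestion[] {
  return journal.filter((q) => q.answer === undefined);
}
```

The structure is the point: named sub-problems, open questions, and answers you can re-read instead of re-derive, which is what frees up the working-memory slots.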
III. Transfer learning and overwhelm
I have two separate threads I’m curious about. I’ll fire off both questions, and then later we can prune.
First—I’m quite curious about the mechanics of overwhelm. It’s something I’ve personally encountered in the past when attempting various kinds of deliberate practice, and a phenomenon that I’ve often seen recur in others as they attempt to learn a new skill, or introspect on what they find hard or aversive about getting better at something.
You mention “keeping a journal” as a response to overwhelm.
Are there other insights you’ve had about how overwhelm feels to you? Or how you have seen it feel to others? I guess I’m curious about both the mechanics and the qualia (since I think the qualia can hold pointers in the direction of uncovering the mechanics).
In addition, what are other tools you’ve come across that people use for responding to mental overwhelm? (I’m interested in both external ones like “journalling” and mental ones like “notice you’re overwhelmed and pause for 5 seconds”.)
Second, I am reminded of some comments by Phil Tetlock and Bryan Caplan on transfer learning (both on the 80,000 Hours podcast). They are coming at it from the perspective of academic psychology and economics. I will drop them here—and then am basically just interested in your response.
https://80000hours.org/podcast/episodes/prof-tetlock-predicting-the-future/#transcript
https://80000hours.org/podcast/episodes/bryan-caplan-case-for-and-against-education/
Man. First, christ-on-a-cracker this paragraph is intimidating:
And I’m torn between “man, Christ, that is actually pretty compelling evidence.” Psychologists are, like, known for having pet theories they trick themselves into believing. And if they all gave up on meta-learning, geezus those are some major skulls lying on the ground.
But also, it then talks mostly about, like, “Latin helping you do math.” And that’s not at all a thing I expected to be true.
My explicit beliefs/hopes about transfer learning involve a multi-step engine, and you need all the pieces to be there:
Metacognition. You need to be able to notice your own thoughts, and form accurate models of “what happens when I intervene on these thoughts?”, that you test against reality.
Identifying how a big skill breaks into subskills
Identifying how to train at given subskills
Identifying which subskills are relevant in a new domain, and then applying those subskills
To tie this back to the overwhelm question: I think managing “overwhelm” is a major subskill for basically any other complicated cognitive skill. I identified it while reading Machine Learning papers. I applied it to Thinking Physics. I applied it to teaching. I’m quite confident this is a cross-domain skill.
There are a few techniques I currently know of for managing overwhelm:
Write everything down
This includes “vomit everything out onto a page”, as well as a followup skill of “take notes that are well organized so that you can think through the problem methodically.”
Use some props / doodles.
Rather than try to write everything down, even just making a couple of squiggles and shapes on a notepad can help you organize your thoughts. E.g. draw a square and a circle, and then say “Okay, this square represents THIS fuzzy part of the problem I haven’t figured out, and this circle represents THAT OTHER fuzzy part of the problem I haven’t figured out”, and then that makes it a little easier to track.
Talk it through with a partner
Talk out loud
Pace around, using visual-space cues to sort of memory-palace yourself.
Identify one small piece of the problem and just focus on that and ignore everything else
Different problems have different requirements on how you might manage overwhelm. If you’re sitting at a desk with a laptop, you might have a huge set of software tools.
But if you’re in a tricky social situation where you’re tracking a bunch of people who are secretly angry at each other, it’ll look suspicious if you start taking detailed notes. So maybe you need to rely on things more like “do some rough doodles on a napkin”, and use one doodle to represent “Alice seems tense?” and another to represent “Bob seems angry?”, and you can’t talk out loud about it but maybe you can surreptitiously doodle.
Or, if you can’t even doodle: maybe you look around your environment (say you’re at a dinner table at a fancy dinner), and maybe you move your forks/knives/food around your plate and use it as an impromptu notepad.
Or, maybe you practice the subskill of “politely excuse yourself to the bathroom”, and you actually take real notes on a 5 minute break that helps you manage the problem.
Or, maybe you learn to figure out which stuff you can most easily afford to tune out and ignore, and just focus on one part.
I expect that if you just train overwhelm-management in one domain (i.e. where you have full control over your environment and you understand the problem-frame pretty well), you’ll train a somewhat narrow way of managing overwhelm. But if you deliberately cross-train it in multiple domains, you’ll come up with tools that handle it in different ways… and if you deliberately attend to what-is-similar and what-is-different between domains, I think you’ll end up better-than-average at learning to manage overwhelm in a third, novel domain, under time pressure.
As I read your notes on overwhelm, I am reminded that I have experienced an additional phenomenon not mentioned here, which was also such that, as soon as I identified it, I started applying it across the board to life and work. I’ll elaborate on it a bit.
As you know, I’ve been working on my pilot’s license. This is relevant for a few reasons:
1) it is very overwhelming: at any one time you gotta deal with some subset of your speed, altitude, direction, centrifugal coordination, the health of your engine instruments, wind, terrain obstacles, clouds, other planes, a guy talking to you on the radio, reprogramming your transponder, your location on the map, airspace regulations, instructor yelling at you / passenger chit chatting… and more, all of which might require decisions on a second-by-second level
2) learning to fly involves me doing deliberate practice trying to acquire a skill.
3) the stakes are pretty high. Simply put, if I mess up badly enough, I die. And the skills and metis I’m picking up are selected for coming from people who have managed to at least not die yet.
It is a common saying that “the cockpit is a terrible classroom”, precisely because doing deliberate practice under this amount of overwhelm is really hard.
Nonetheless, in practicing flying, I discovered a particular skill for dealing with overwhelm. Let’s call it “the unconscious scan”. Presume I asked you to maintain a given airspeed, altitude, heading, and engine RPM, while also ensuring your engine readings remained normal and safe. There are two ways you could do it. Initially, when I started learning I would try to look at the instruments one at a time, verifying each was correct: “airspeed—check, altitude—check, heading—check” and so on. There was an orient step happening for each. The problem with this, however, is as follows: I look outside the cockpit. I check my instruments. And then once I look back outside again I’m suddenly going toward the ground, or in a different direction, or up into the sky. The process is too slow. (That’s a bit overdramatised, but not too inaccurate when you’re just starting out.)
Later, however, I discovered I was able to execute the algorithm of the unconscious scan: “move my eyes across all of the instruments—moving at a regular fast pace that does not slow down to verify that they all had the right values. At the same time, silence the voice in my head that would previously read out the checks.” So I would smoothly and forcibly move my eyes across all the instruments. And then later, as my eyes are back looking outside the window, the readings start to trickle back in like the outputs of async callback functions: “Oh, my speed was too fast and I’m veering east, let’s correct.” Turns out, I was able to batch-process a lot of stuff in an async and unconscious way, and this was much faster than doing it all deliberately and synchronously.
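Since the async-callback analogy is doing a lot of work here, here’s a minimal sketch of the control-flow difference in TypeScript. It is purely illustrative: the instruments, readings, and corrections are all made up, not real avionics logic.

```typescript
// Illustrative sketch only: instruments, readings, and corrections
// are hypothetical.

type Reading = { instrument: string; value: number; ok: boolean };

function correct(r: Reading): void {
  console.log(`correcting ${r.instrument} (was ${r.value})`);
}

// Deliberate scan: pause on each gauge, verify it, narrate the check.
// Attention is blocked for the whole loop; by the time it finishes,
// the plane has drifted.
function deliberateScan(instruments: Array<() => Reading>): void {
  for (const read of instruments) {
    const r = read();
    console.log(`${r.instrument}: check`); // the inner narrator
    if (!r.ok) correct(r);                 // verify before moving on
  }
}

// Unconscious scan: sweep across every gauge without stopping to
// verify. Needed corrections "trickle back in" later, like async
// callbacks, after attention has returned outside the window.
function unconsciousScan(instruments: Array<() => Reading>): void {
  for (const read of instruments) {
    const r = read();                            // fast sweep, no narration
    if (!r.ok) queueMicrotask(() => correct(r)); // surfaces later
  }
  // control returns immediately: eyes are back outside the cockpit
}
```

The decoupling is the whole trick: in the second version, noticing and correcting are separate steps, so the sweep itself never slows down.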
I then felt like this skill made me a faster reader: I similarly discovered I could silence the voice in my head and scan text content, and I didn’t have to rely on a deliberate, conscious, verbal narrator to guide me along at each meta step. Put another way: I had already taken in the text in the paragraph. I just needed to trust it was somehow there, somewhere, to be processed. I didn’t have to also articulate it.
I also feel like I was able to use unconscious scanning to do something I saw you mention in a Slack comment, when you reflected about getting a headache after we did pair programming:
I find that orienting to things with more of an unconscious scan—mostly going with the flow averbally, and then trusting that I’ll occasionally have a flag that’s raised for doing a deliberate correction—has increased my mental energy for various kinds of work.
Finally, I’ve been practicing the skill of “enter a room. Force my eyes to scan all around it and then return to a focus point. Then let updates about what items were in there trickle into consciousness.” It remains to be seen whether there’s anything useful in that.
For now, I’ll leave you with this anecdote rather than a question, and I’m interested in your reactions to it.
Yeah that makes sense and I find it a useful concept.
What was the process of gaining this skill like? Do you think you can generally “skip ahead” to having this skill, or for each domain you apply it to, do you need to first do a slower, more effortful learning process to build the model of what’s going on with each Notable Thing?
Another way of framing the question: when you imagine switching from “consciously deliberate on each thing” to “do an unconscious scan”, what qualia do you experience in each case? Is there any qualia to the process of switching your intention from deliberate to unconscious?
...
...
Okay, now let me think about how to apply this to myself. Here are some domains where it seems likely relevant:
Downwell.
Tricky conversations, especially when there were multiple people with competing needs.
Let’s start with Downwell.
In Downwell there’s a ton of stuff going on: There might be 4-8 monsters on the screen, each with different movement patterns. Many of their movement patterns are optimized for being predictable-in-some-sense, but also counterintuitive. Various little features of the game push you in different directions. You recharge your gun every time you land (you spend most of the time falling down a well), and when you run out of bullets, if monsters are coming at you, you’re pretty fucked. But, also, you get bonus powerups if you go a long time without touching the ground.
Here’s a GIF: [GIF of Downwell gameplay]
For ~4 years, I tried a combination of things like:
Try practicing one particular thing at a time. Like, this particular game-run, I’m just going to focus on “not getting hit by frogs.” (Frogs have a particularly annoying movement pattern.) But, in practice this is just really hard. If I’m just focusing on frogs, other stuff keeps happening to me, and even if I’ve decided I only care about frogs this particular run, it’s hard to concentrate.
Try just… doing a pretty good job at everything? It’s not obvious at first glance that this is different from “trying to just do unconscious scans”. Like, I’m generally trying to take in the info and integrate it. I’m clearly doing that to at least some degree.
What’s the difference between “the unconscious scan” and “sorta winging it”? I guess it does still require a conscious step of “actually move your eyes over the whole screen”, rather than just being tunnel-visionedly focused on the area around your character.
But I also expect this is not going to be sufficient. Empirically, the thing I was doing was very far from working, since it took me 3+ years to beat the game on normal mode, and a few years later I still haven’t beaten it on Hard Mode.
A new thing I’m doing this week is to watch replays of my games, where when I take damage, I go into slow motion and watch everything happening on the screen, and form a complete model of why I took damage in that moment, and what the earliest action was that I reasonably could have taken to avoid it. In the process, I’m building out lots of specific models like:
when a frog disappears offscreen, I need to remember where that frog was, and avoid maneuvering into a spot the frog is likely to land.
when a monster is coming at me from the side, I feel a desire to jump up and then kill them from above. But, actually, it’s often better to run away and fall (you can fall faster than monsters can follow)
sometimes a monster appears to be approaching me and I want to run away, but, actually, I’d escape easiest if I ran towards them and then found a spot to fall through (i.e. I have my back to a wall, the monster is approaching, but there is an opening in the floor between me and the monster)
Since I’ve started “record, then rewind-and-replay each troubling section”, I’ve gotten better at noticing in the moment all the little-things-that-go-wrong. But I haven’t yet gotten significantly better at avoiding those problems.
I’d expect the Optimal Unconscious Scan to require:
Learn to actually move my eyes frequently around and note each relevant object
Learn which things count as notable things to track
Learn what I’m supposed to do about each thing. I think this requires a fair amount of intensive training.
That said, I did just try playing one round of Downwell where I introduced “move my eyes to each new object on the screen”, and I did do better than usual. It also felt like a new concrete skill I was doing. So, yay. Thanks!