Raemon’s Deliberate (“Purposeful?”) Practice Club
Introduction
So. I have a theory of feedbackloop-first rationality.
It has a lot of parts. I think each part is promising on its own, and I have a dream that they interconnect into something promising and powerful. I also have a standard, which is that you should be able to tell if it’s helping.
One of those parts (I think/hope), is “the generalized skill of Deliberate Practice.” That is, the meta skill of:
Noticing that your goals are bottlenecked on some kind of skill (or skills).
Figuring out what those specific skills are.
Figuring out who can teach you those skills, or, how to teach them to yourself.
Creating an explicit practice regime.
Actually putting in the work to practice.
Noticing when your practice isn’t working, and figuring out how to troubleshoot your process.
I do not currently have this meta-skill. I am kind of betting that it exists, based on reading books like Peak, talking with Romeo Stevens, and reading stories like that of László Polgár, who methodically taught his daughters chess.
I think I’ve made progress in the two months I’ve been working on it, but that progress hasn’t translated into “I quickly gained multiple skills” yet, which is the standard I feel like I should set for “this is actually working well enough that other people should be paying attention.”
I’m experimenting with using this dialogue format for journaling my explorations here. I’m inviting a few people I know well to be top-level dialogue participants. Everyone else is welcome to follow along in the comments, and note down their own deliberate practice experiments.
This will include a mixture of high level theory, and day-to-day practice notes.
Okay, reviewing some of my goals here. Here are things that feel like valuable end-goals in and of themselves.
I want to get better at prioritizing projects at Lightcone. Right now I feel very “in the dark” about whether anything we do is even helping. I have some guesses for the subskills here.
I want to figure out whether/to-what-degree the Meta Deliberate Practice skill can meaningfully be applied to “research” (alignment research in particular, but also generally).
Get better at programming.
Get better at emotional regulation. Moderately often, I get somewhat annoyed about something and it makes a conversation go worse (or, builds up some small resentments over time)
Get better at sleeping, somehow.
Get better at Downwell, (a game that I have struggled with beating for a long time), quickly. (This one is mostly for fun)
The actual point of this project is the first two bullets. The thing I feel most excited about “rationality” for (compared to, like, learning specific skills, or other frameworks for dealing with problems), is to solve problems that are confusing, where having an accurate map of the world is likely to be your primary bottleneck.
The latter bullets are things I care about, but I’m mostly interested in them right now from a lens of “looking for things that seem genuinely worth doing that feel more tractable to practice.”
Some particular subskills that I feel interested in practicing, but mostly because I believe they somehow help with the above:
Get better at making calibrated forecasts (related to decisions I care about).
Get better at Thinking Physics problems (I think of this as a testing ground for some subskills related to research)
Estimation (i.e. find concrete things to practice estimating, with an eye for getting better at estimating value of fuzzy projects)
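For the calibrated-forecasts subskill, one common way to get an unfakeable number is to log each forecast with its stated probability and score it once it resolves. The sketch below is only an illustration: it uses the Brier score plus a simple per-probability hit rate, and the function names and the example forecasts are invented for this note, not taken from any tool mentioned here.

```python
from collections import defaultdict


def brier_score(forecasts):
    """Mean squared gap between stated probabilities and what happened.

    `forecasts` is a list of (probability, happened) pairs.
    0.0 is perfect; always saying 50% scores 0.25.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


def calibration_table(forecasts):
    """Map each stated probability (rounded to one decimal) to the observed hit rate."""
    buckets = defaultdict(list)
    for p, happened in forecasts:
        buckets[round(p, 1)].append(1.0 if happened else 0.0)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}


# Hypothetical resolved forecasts, invented for illustration only.
example_forecasts = [
    (0.9, True), (0.9, True), (0.9, False),
    (0.6, True), (0.6, False),
    (0.2, False),
]

print(brier_score(example_forecasts))
print(calibration_table(example_forecasts))
```

A well-calibrated forecaster would see hit rates close to the stated probabilities in the table. With only a handful of forecasts per bucket the table is very noisy, so it probably only becomes informative after a few dozen resolved predictions.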
I want to make a terminological note that may not be that helpful but it is at least related and might be interesting. I recently read “Peak”, which is the pop-sci book by K. Anders Ericsson, the discoverer of deliberate practice. In it, he uses another related term which is “purposeful practice”. My memory from reading the book is that the only difference between them is that deliberate practice = purposeful practice using a pre-existing training regime that is known to be effective.
I think part of his reason for making this distinction is that, even if you do everything else right via purposeful practice (high focus, rapid feedback, well-defined goal, pushing your comfort zone) you could spend your whole life doing that and, although you would get really far, you could get dramatically farther by following a training regime that all the previous purposeful-practicers figured out was the best. And so you should spend a lot of resources looking for experts, mentors, et cetera.
I’d guess that what you, Ray, will be doing is almost always the purposeful practice version.
Yeah, I ran into the purposeful/deliberate distinction recently (it was actually before I started writing this dialogue prologue, but it felt sort of effortful to add the extra distinction and I was lazy. :/)
(I’ve included the distinction in some of the in-person Deliberate Practice Club stuff I’ve done)
Raemon Playing Downwell
I had been pretty excited about coming back to Downwell, after trying vaguely deliberate-practice shaped things with it for years without seeing rapid progress.
Unfortunately, after a month of trying pretty hard practicing a bunch, I have still failed to get much better at it. I did become very aware of lots of little subskills I’m missing.
I think part of the problem is I actually really need my peak cognitive/reflexive hours of the day to have a shot at improving at the game, and I just can’t really spare them from other more important things.
Downwell is a reflex game where you’re, er, going down a well. You can shoot bullets out of your boots, which a) kill monsters below you, b) slow your fall, allowing you to control your speed. You need to fall through 4 zones, each one 3 levels long, and then beat a final boss.
There’s a variety of powerups and mechanics that add complexity. At first I focused on beating the whole game, but found there were a lot of different things to track which made it hard to actually practice any given skill.
I set the intermediate goal of “be able to reliably beat stage 1 of Hard Mode, without taking any damage.”
I feel like I got slightly better at this at my peak, but it was hard to sustain, and I don’t think I hit a stage where it was demonstrably clear that I was “really better” instead of “getting into the zone once or twice.”
List of things I tried:
practicing jumping back and forth, precisely
recording videos of myself, and watching them in slow motion
taking notes about what caused me to take damage each time, and what actions I could have taken instead.
watching videos of speedrunners
looking up strategy notes on the Downwell Wiki
translating my observations into the Downwell Wiki
practicing “only falling, no jumping or landing on enemies’ heads to kill them”
practicing looking briefly at each new object that enters the field
practicing focusing on “negative space”, i.e. tracking each place where the monsters aren’t
practicing keeping track of how many bullets I had fired, so I was never surprised by running out of them.
Overall… I feel like I gained lots of little awarenesses of things, and I even feel like I learned a lot of specific new skills… but somehow it didn’t translate into getting noticeably reliably better.
Whoa! I’m really surprised you didn’t improve much after a month of trying all those specific things. That list is basically exactly the kind of thing that I would expect to work. Out of curiosity I’m going to brainstorm a list of possible explanations I can think of:
You have hit your inherent skill limit at this particular activity (maybe because it’s reaction-time based and your reaction time is fixed)
This particular activity has an improvement curve such that improvements take longer than a month
You got really unlucky when choosing that particular set of skills to practice (and if you practiced another similar list of 10 things, you would have gotten noticeably better in a month)
Potential improvement is secretly building up inside you but will cash out in performance in a bigger burst
You’re doing it too intensely, and more/different breaks/spacing would have been better
Your sleep is bad so your brain never “consolidates”/“propagates” the lessons from the practice
You are actually getting better, but just measuring it badly
...you have a psychological resistance to getting better at this game… okay now I’m scraping the bottom of the barrel (note I am not babbling here)
My sleep has actually been fairly bad this month so that hypothesis is not crazy.
Another major hypothesis: I would often play while tired at the end of the day, because, well, it’s fun and my impulse control is poor then. I’ve heard that can train bad habits. I’m currently anxious about that maybe undoing all the good, and this might generalize to like my entire life.
That all said:
This past week, I started to get a feeling of Downwell skills solidifying. It still hasn’t translated into being persistently better, but there’s a particular mode I sometimes enter where I seem to do better, and I think I can tell when I am and am not in the mode.
The difficulty appears to be that there are like 5 skills I need to access that mode, which are something like:
ability to quickly fall
knowledge (in your muscles) of when/how to dodge various kinds of configurations of enemies
ability to notice when it’s time to stop falling
ability to not get sucked into the alternate playstyle where you’re not falling.
habit of frequently, briefly landing to restock your bullets (which you can spend to slow your fall as well as shoot enemies)
Basically no single mode of playing the game seems “sufficient”.
A central element of Downwell is that, when there’s overwhelming stuff all around you, it’s usually safer to just let yourself fall rather than try to deal with all the overwhelming stuff. You fall very fast, so you will easily outrun all the monsters around you, and most of the time you’ll end up in a section of The Well where there are fewer complications to deal with.
The problem is that you fall very fast, which means you’ll eventually accidentally hit something dangerous when falling at speed.
There is a simple, elegant way to play the game where you’re always falling, and there’s a simple, elegant way to play the game where you’re always landing on enemies’ heads and building up combo points (which give you more resources, which help you later in the game).
I find hovering in the blurry lines between these two playstyles fairly challenging.
What I find most confusing is… I feel like there’s some kind of conservation of damage I take. I’ve gained little subtle bits of skill over the past month-and-a-half. I’m confident that I now have the ability to jump between two monsters that are coming at me from slightly different angles, where before I would have just had no ability to escape. I’m confident that I’ve learned how to “not accidentally jump” (I used to be very trigger-happy with the jump/bullet button, such that I’d accidentally jump up into a monster right above me).
But I kept… seeming to reach level 4 with roughly the same variation in “how much damage I took”.
I think maybe there’s just variation on “how tired/attentionful I am”, which dominates, and the skill I was gaining was just smaller than that variation?
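One rough way to check the “skill gain is smaller than the variation” guess would be to log damage taken per run in two periods and compare the average change to the run-to-run spread. The sketch below assumes such a log exists; the numbers are invented for illustration (they are not real Downwell results), and `gain_vs_noise` and `runs_needed` are hypothetical helper names.

```python
from math import ceil, sqrt
from statistics import mean, variance


def gain_vs_noise(before, after):
    """Compare the average drop in damage to the run-to-run variation."""
    n_before, n_after = len(before), len(after)
    gain = mean(before) - mean(after)  # positive means less damage now
    # Pooled variance: the typical spread of a single run within a period.
    pooled_var = (
        (n_before - 1) * variance(before) + (n_after - 1) * variance(after)
    ) / (n_before + n_after - 2)
    noise_sd = sqrt(pooled_var)
    standard_error = noise_sd * sqrt(1 / n_before + 1 / n_after)
    return {
        "gain": gain,
        "noise_sd": noise_sd,
        "standard_error": standard_error,
        "gain_in_standard_errors": gain / standard_error,
    }


def runs_needed(gain, noise_sd):
    """Rough runs per period for the gain to reach about two standard errors."""
    if gain <= 0:
        raise ValueError("gain must be positive")
    # With n runs per period, SE = noise_sd * sqrt(2 / n); solve gain >= 2 * SE.
    return ceil(8 * (noise_sd / gain) ** 2)


# Hypothetical damage-per-run logs, invented for illustration only.
before_runs = [3, 1, 4, 2, 3, 0, 4, 2]
after_runs = [2, 1, 4, 1, 3, 0, 3, 2]

stats = gain_vs_noise(before_runs, after_runs)
print(stats)
print(runs_needed(stats["gain"], stats["noise_sd"]))
```

If the gain comes out well under one or two standard errors, as it does with these made-up numbers, a handful of runs cannot distinguish “really better” from “in the zone once or twice”. In that case the options are to log many more runs or to find a less noisy measure.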
Lynette’s Catching the Spark exercise
[Context, this is the very long result of my attempt to apply Logan’s naturalism sequence to get more surface area on optimizing learning. It’s an ongoing exercise, so I’ll post updates here or in comments.]
Part 1: Articulating Stories
Optimizing for learning – I want to measure learning by plan/insight shifts for concrete reasons, so that I make progress instead of getting stuck playing around with insight porn.
If I’m not changing my opinions/decisions, I’m probably not learning efficiently.
How often? How to measure? – the statement is right and feels useful, but I want to check that and figure out how I can do it. If I’m wrong about how I make progress, I want to know that.
Part 2: Squinting At Stories
Assumptions
I mean something by “progress”
I mean something by “concrete reasons”
Plans change for reasons
Plans changing for tangible reasons (ones that you can point to) is good.
You can measure learning
I want to measure this learning, not just let it happen
I want to be really specific about how to measure it, so that other people could replicate it
I’m worried that the default option is insight porn, not progress, when people try to learn something
I’m specifically worried that people trying to “learn” do something very wrong and different from what you do when trying to update a decision
You can make progress by measuring concrete changes
I mean something by “progress”
Specifically, I mean progress toward accomplishing some other goal. Learning here is an instrumental step, not a terminal goal. I might not know precisely what that goal is, but I should at least vaguely have a desired outcome in mind.
Progress should be visible, meaningful – you should be able to point to things in the world that are different, importantly different. Progress doesn’t matter if it’s so small or noisy you can’t be confident it’s there.
Progress should last. I care about changes you can view a year later and tell that it’s still a meaningful difference.
Maybe I’m overloading this word? I’m expecting something very grand here, and most things don’t cause grand changes?
I mean something by “concrete reasons”
Specifically, I mean “tangible” or “can verbally describe why I changed this plan, at least as long as I can check my notes”
I’m worried that the default option is insight porn, not progress, when people try to learn something.
What evidence do I have for this? Vague sense that I’ve seen lots of people try to do skilling up projects that didn’t work well, and also lots of people trying to do career exploration or research direction exploration and failing to make progress that felt justified.
There are two separate worries here:
1. That people trying to do learning feel like they’re getting lots of little insight hits, but can’t recall anything that had lasting change months later. When I did 1-year coaching check-ins that once, most of the people had a hard time remembering what we worked on, let alone what had lasting impact. This probably indicates that people are bad at remembering what caused them to change their mind, but might also indicate that they just didn’t change things long term. I’m slightly inclined to believe the second, because some of the clients I’ve worked with for years still have the same set of issues (e.g. anxiety or ADHD symptoms), so I remind them of the same strategies that they’d previously discovered helped. It feels here like we help the underlying issue a bit, but it’s a constant maintenance game because the underlying issues don’t change.
2. People attempt to do “learning” like they’re in school. They work through problems and assume the knowledge will be good for something, sometime. They don’t approach it as though they were trying to efficiently find out what would make them change their mind, nor as though they were trying to gain the skills necessary to actually do something. If I’m trying to optimize for “learning” as the outcome, I’m at risk of falling into this pattern unless I have a correction mechanism.
These both seem like valid worries, but I’m less confident that this is the default outcome. I wouldn’t be surprised if it was, but I need more data here before I can conclude that. Maybe measure how many “aha moments” clients had that they forgot or turned out not to work?
You can measure learning
First, this seems inherently somewhat hard, because you’re trying to find the unknown unknowns. Sometimes you have to look in places that turn out to be dead ends, and you don’t know in advance for sure what each will be.
Cruxes seem like one measurement of learning. If you can identify key insights/facts that change your decision/opinion, that’s a kind of measured impact of learning. Or if you could write a summary before and after reading something, so you can point to what felt fuzzy before and now you have a sense of?
Learning by Writing just seems like a way of doing this. If Holden is using a different method for learning than most people, maybe doing something closer to that method could result in better outcomes for most people? (Lots of ifs and assumptions here)
I can point to lots of tests I’ve run. Most of them had a kinda neutral outcome – nothing specifically violated or confirmed my expectation. I’ve had a smaller number that yielded a big aha moment, and changed what I did as a result. It feels really important to identify what increases the likelihood of those aha moments, and learning new info.
I want to be really specific about how to measure it, so that other people could replicate it
I don’t just want to have a sense this is important, I want to know what kinds of practices lead to big changes. This exercise seems to be helping, filling out reasoning transparency template for an idea helped, the Sprint book worked.
What would it take to make a similarly useful version for career exploration? A set of instructions that would actually walk someone through to engaging deeply with key cruxes and resolving them? Ahh, this feels super shiny.
This feels like a big black box, important but unclear how or what I’ll find out.
Cruxy: Measure changing decisions
Because how is a big hole in my understanding. I get a sense of importance, bigness, and also handwavy vagueness when I think about it.
You can make progress by measuring concrete changes
I have a strong sense I want to optimize for “aha” breakthroughs. If I’m doing deliberate practice, I want to maximize the number of realizations for how to do something better. If I’m making a startup, I want to maximize the number of iterations making the product better in response to customer feedback.
I have stories that concrete changes lead to good outcomes. Is this because they’re easier to measure? Maybe?
This is tied to a dread fear of spinning in circles and never improving, and the solution I’m grasping is feedback loops. I don’t know if they’re sufficient, but feedback loops at least feel necessary.
New statement: I want to be able to describe how I measure learning by plan/insight shifts for tangible reasons, so that I make progress instead of getting stuck playing around with insight porn.
What does it look like when I measured learning in the past?
Part 3: Locating Fulcrum Experiences
Where does the data live?
I can see if I feel an “aha” moment, those seem relevant. More specifically, maybe aha moments, followed by action? Or writing out the reason for the aha moment?
Maybe try looking for which types of actions or situations led to aha moments?
Maybe working backwards, I should be looking for types of situations where I’m searching for an answer? Where there’s something I want to understand, and I’m trying to look for it?
I identified a few times it felt like I had a clear shift, I can look at some of those more closely (This exercise seems to be helping, filling out reasoning transparency template for an idea helped, the Sprint book worked. Some of the weekly hypotheses helped)
Deliberate practice exercise
Insight: My feedback loops were way too long
Experience: Something like, just looking at the list of criteria and rating how I did, feedback was obviously poor. Then checking what my feedback loops were. Maybe a sense of handwaving, that it would have been hard to describe what feedback I was expecting? Hard to remember.
Reasoning transparency exercise
Insight: I think there’s something valuable to lean methods but I don’t have a clear thing that will be helpful
Experience: First, it felt good, that I was writing out my arguments. Then, a sense that I wasn’t sure this would be useful, and that there wasn’t a clear take away here. There were a bunch of useful sounding threads, but I wasn’t sure how to use them. Maybe vagueness, a sense that I was lumping together overly broad stuff because they were “somehow” related? I concluded both that I had a really strong case for the general thing, and that almost all of the benefit came from more specifics than I could currently write well about
Catching the spark exercise
Insight: Being able to describe in detail how to measure learning is crucial
Experience: I was writing in response to the prompts, felt like exploring? Then the measuring so I could describe it felt super shiny and at the same time handwavy. Like I knew this was the most important part, but when I asked myself what I would write, there was a big smudge on the mental map and I was struggling to fill in the details.
Sprint exercise
Insight: I was totally misguided about the blockers to EA therapists
Experience: I was asking questions of therapists and potential clients expecting a certain type of answer, but I knew I wasn’t sure and I wanted to get info really quickly. As soon as I started talking, the issue of limited therapist availability came up. It was a surprise, something that didn’t fit my prediction, and it was immediately obvious I needed to change tracks.
Experiences:
Handwaving
Feeling like something is important but vague
Not feeling quite sure how the pieces fit together
Knowing I’m uncertain or don’t have enough evidence
If I’m trying to zoom in here, learning to notice handwaving or “important but vague” areas seems good. However, this thread only feels like one part of the thing I actually want to study—I want to learn how to set up the conditions where it’s easy to measure learning, and it’s not clear if this exercise would lead there or not.
PUZZLE GAMES
I talked here about beating Snakebird with no hints. I considered myself to have basically beaten puzzle games until I got about halfway through Baba Is You. Around 110 out of 231, I began to really struggle. The techniques for Snakebird only partly transferred. Partially because BIY is more complicated, but also I felt less able to break down my moves. I often just “saw it” in BIY and having to explain or extract meta principles felt false.
I ground through another 20 levels, and got a friend’s help with 2 or 3, where I learned that I often tried to simplify the problem by doing moves I “had to” at the beginning, which turned out to be incorrect. I used this insight to beat another 5 puzzles or so. I’m now ~60% through the main game. I’ve had access to the “final” puzzle for ages but am absolutely stumped on it.
DRAWING
gave myself tendinitis, quit
TYPING
tried typing, maybe made some progress, but it was using up a surprising amount of emotional energy that I couldn’t justify so I stopped
PLANNING
I want to get better at planning out projects instead of just muddling through them. This involved a lot of steps that weren’t deliberate practice, which I’m leaving out.
I’m currently trying to write out delegation instructions to myself, as if I were a competent assistant without much context. I originally tried this with my big main project, and it failed immediately. Doing it with small projects seems to be in my zone of challenge, so I’m practicing that as able. This will only pay off if I build the ability to apply it to larger projects, since the projects it works for don’t actually need it.
EVN, I wonder how much the planning changes based on the project stage? I often have a period of puzzling around trying to figure out what I’m trying to do, then stages where I have a concrete plan for a bit, then that plan goes off the rails somehow and I’m back to muddling around. As the project gets closer to completion, the landscape shifts toward more planning and less muddling.
If this generalizes to you, lots of muddling in bigger projects might indicate that you’ve got a long way to go on the project, possibly more than you were thinking. Does that resonate with you?
Part 4: Getting your eyes on
Attempt 1:
Honestly, this question of “what does it actually mean to measure learning?” seems like a decent example for the getting your eyes on exercise.
Object level, it feels like there’s something important going on about that prompt that I won’t discover if I’m just following the prompt of noticing “important but vague” (ibv) experiences. Relevant threads:
Am I only looking where I’ve noticed successful aha moments before, and hence neglecting places where I could have had but failed to have an aha moment?
I don’t think these kinds of insights come from chance. I will occasionally have relevant shower thoughts, but they almost always come within 24 hours of me spending substantial time engaging with the content. Making time for deeply thinking about my topic seems crucial to the process.
Maybe I’m going about this in too roundabout a way. Maybe I’d get further faster if I tried a sprint on it or just directly tried to answer the question, maybe read up on others’ solutions, and tested out theories for myself. Naturalism is trying to come up with hypotheses that don’t exist yet, and maybe there’s enough here that I don’t need to start with naturalism.
Maybe I’d learn more if I combined naturalism with a targeted experiment attempting to quantify learning in advance. How I would design that experiment feels important and handwavy, though…
Where would I be able to gain meta insight into which of these threads is worth pursuing?
Meta: I had this question keep niggling at me. Various versions of it kept popping up, even creating some aversion around continuing with the exercise, because what if I was just wasting time on the wrong exercise or wrong experience? It would feel really silly if my experiment geared toward figuring out how to measure/bring about learning…didn’t bring about learning. Some version of “this question keeps coming up, and I don’t know how to resolve it” seems like a prime trigger for looking for handwavy important stuff.
When actually trying to write out the handwavy stuff, I had lots of questions or possible threads, but it wasn’t clear which would be valuable. There was a feeling of overwhelm and uncertainty. I could choose arbitrarily, but that felt like guessing on a test. I didn’t have a good method for deciding which questions to pursue, and I want one.
What are the triggers that let me predict which experiments will yield insights? This feels crucial to my current quest, and also would resolve issues that pop up with research as a stochastic decision process. Possibilities:
Set up an experiment, including how I’ll track the data.
Try a sprint.
Get feedback from someone who gives good feedback, generally or on this topic
Make predictions where different outcomes would cause me to have different updates. Abandon experiments where it’s unlikely I’ll have an update, and focus on ones where I can guess I’ll have specific decisions based on likely inputs. This seems obvious, but I don’t think I was specifically designing experiments such that I was likely to change plans – most of the time, I expect to get kinda vague results. This would imply I shouldn’t run those, and should optimize for high variance experiments.
That last one feels super shiny! I should examine it more closely.
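A minimal sketch of that last idea, assuming each experiment can be written down in advance as a list of possible outcomes, each with a rough probability and the plan it would lead to. The `Outcome` class, `chance_of_plan_change`, and both example experiments are hypothetical, made up to illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    description: str
    probability: float   # my forecast that this outcome happens
    resulting_plan: str  # what I would do next if it happens


def chance_of_plan_change(current_plan, outcomes):
    """Total probability of the outcomes that would move me off the current plan."""
    total = sum(o.probability for o in outcomes)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("outcome probabilities should sum to 1")
    return sum(o.probability for o in outcomes if o.resulting_plan != current_plan)


# Hypothetical experiments, invented for illustration only.
easy_experiment = [
    Outcome("vague, mildly positive result", 0.85, "keep going"),
    Outcome("clear negative result", 0.15, "drop the idea"),
]
high_variance_experiment = [
    Outcome("works as expected", 0.5, "keep going"),
    Outcome("surprising blocker shows up", 0.3, "change approach"),
    Outcome("turns out nobody wants it", 0.2, "drop the idea"),
]

print(chance_of_plan_change("keep going", easy_experiment))
print(chance_of_plan_change("keep going", high_variance_experiment))
```

Under these made-up forecasts the second experiment is much more likely to change the plan. Writing the outcomes down beforehand also leaves a record to check the forecasts against afterwards.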
Attempt 2:
Trying to reorient my “engagement” exercise to super focus where I’m likely to get feedback.
Just seeing what people like or comment on isn’t really giving me enough feedback.
Seeing if people will pay for my writing or give quality feedback, that’s a costly signal of value. And I’m probably at the point where I should be making some costly asks and then tailoring my content for the people who really value it. Even if that’s just a few people, I want to be writing to even a small group who loves my writing.
Maybe I could reach out to a few subscribers who Mailchimp rates highly, ask for a feedback call, with compensation? This sidesteps the issue of surveys getting scant data for the amount of user attention they demand.
Meta: Asking for “costly signals” to identify people who “really value” my writing seems both important and vague. There’s a part of my brain that wants to call this a plan and just go execute it. Because then my problem is solved, or at least my brain is relieved of the stress of trying to figure out the solution.
But my vague idea of sending out a reader survey (maybe with an ask for permission to follow up with individual readers), doesn’t fit my ideas of how best to get information. Single piece flow for user testing interviews should get higher quality information, according to my intuitions and LessWrong’s practices.
So trying to think more carefully when I notice the handwavy experience led to a plan change right there. I think there was something in “this is important, and I can just solve it” that feels reassuring. I’m not stuck frustrated or thinking about how likely it is that what I’m going to do won’t work. I think I need a bit of that bluster to not get totally demoralized, but also I think sending out a user survey would have been way more likely to just not get me good information. User testing interviews are more likely, but I still want to put some thought into how to get good feedback so that they don’t flop either (although they do have the advantage that I can iterate quickly if the first couple don’t get good info).
New hypothesis: The handwavyness is playing a role here – it’s stopping me from getting bogged down and demoralized when I’m uncertain. Does this generalize to other examples?
Object level: I want to lean really hard into this idea that each experiment should be likely to change my plans. It seems like a good step toward my goal of optimizing for measured learning that changes plans.
To try: For each hypothesis I want to test next week, try to put myself where the data is as richly as possible. Be really wary of doing “easy” experiments where I already have a good idea of what feedback I’ll receive or where I expect to get kinda muddled, uncertain results.
I identify a lot with the struggle of “man, how do I actually measure progress here, in a way that isn’t overwhelming?”
I think “how to actually measure progress at learning” is one of humanity’s big Hamming problems. My current take, from Feedbackloop-first Rationality, is to think of myself as embarking on a long project to improve my ability to measure, without worrying too much about whether I’m currently able to succeed at the level I’d prefer. i.e. ask the question “how can I incrementally do better-at-measuring than I previously have?” without letting the perfect be the enemy of the good. And be transparent-to-myself about my current capacity to measure.
In practice, for me that is:
Do some concrete exercises with clear, unfakeable feedbackloops (which maybe are goodharty, or only one piece of the puzzle, but are at least unfakeable)
Do some “vague good thing I believe in”, and do as honest-a-retrospective as I can on it so that I can improve for next time in a less goodhartable, but more fakeable, way
Hope those two things average out somehow into something good.
(other topic)
A thing I’m noticing right now is that, while Downwell didn’t go that well, I really should have at least 3 things I’m practicing, so that I can triangulate “how to learn?” between them.
Yeah, I agree with not worrying too much about whether I’m already at the level I’d like to be at.
However, I have an intuition that most methods of improvement here are way slower than the best ones, and I might be able to go a lot faster if I’m deliberately paying attention to speeding up my feedback loops. At least, that’s the hope for the current stage of my experimentation.
What’s your rationale for having three things to triangulate between? I find I usually want to focus on one practice at a time.
I’m not necessarily saying ‘practice all three at once’, but I’m trying to avoid the lost purpose of ‘getting good at X’ when what I’m actually trying to do is develop the general art of getting good at things. In particular because in this case I was starting with an X that was optimized for ‘easy to measure’ rather than ‘actually important’. (i.e. Downwell)
Ah, makes sense.
Notes from Getting my eyes on, take 3
1. There’s probably a trade off here where the thing I actually want to optimize is (“chance of insight” divided by effort required). I want experiments that have a high probability of new insights, relative to the amount of effort/time it takes me to run the experiment.
2. Right now, at this stage in my experiment, “optimizing for measured learning” feels like throwing out a lot of potential experiments because I’m pessimistic they’ll actually teach me something. I might be surprised and find my model was totally wrong, but in most worlds I’m not going to learn much. So I’m throwing those out and just prioritizing experiments that I expect will yield actionable insights.
I don’t think this would have been possible if I hadn’t studied experiments so much. I just wouldn’t have intuitions around which experiments are likely to yield novel insights.
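The ratio from note 1 can be written down as a rough ranking. This is only a minimal sketch: the `Experiment` fields, the `min_p_insight` cutoff, and every number below are hypothetical placeholders for gut estimates, not values from these notes.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    p_insight: float     # subjective chance the result changes my plans (0 to 1)
    effort_hours: float  # rough guess at the time/effort needed to run it


def insight_per_effort(exp: Experiment) -> float:
    """'Chance of insight' divided by effort required."""
    if exp.effort_hours <= 0:
        raise ValueError("effort_hours must be positive")
    return exp.p_insight / exp.effort_hours


def rank_experiments(experiments, min_p_insight=0.0):
    """Drop experiments unlikely to teach anything, then sort the rest by ratio."""
    kept = [e for e in experiments if e.p_insight >= min_p_insight]
    return sorted(kept, key=insight_per_effort, reverse=True)


# Hypothetical candidates; the probabilities and hours are made-up guesses.
candidates = [
    Experiment("user testing interviews", p_insight=0.6, effort_hours=6.0),
    Experiment("re-read old notes", p_insight=0.1, effort_hours=1.0),
    Experiment("survey readers", p_insight=0.3, effort_hours=10.0),
]

ranked = rank_experiments(candidates, min_p_insight=0.2)
for exp in ranked:
    print(f"{exp.name}: {insight_per_effort(exp):.3f} insight-chance per hour")
```

The cutoff mirrors note 2: experiments I am pessimistic about get thrown out before ranking, even if they are cheap.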
I notice I’m flagging two slightly different experiences during collection:
1. “There’s something important, but it’s vague/I have a hard time describing it in detail.”
2. “There’s something that seems important to do, but I’m not sure exactly what to do.”
They overlap a bunch, but they’re not the same. The first has a sense of certainty, “of course this is important!”, but my mind scrambles when I press it for details (It usually feels like I could summon the details, I just need focused time to remember them). The second feels confident when I’m planning the task for the future, yet has a tiny buzzing sense of uncertainty, often with a mild ugh field, when I actually try to do the task.
Collection 3
A client suggested I help them create a doc tracking what career options they’ve considered, what tests they’ve tried, and why they abandoned certain ideas. My mind immediately latched onto this as shiny – something important and relevant to my attempts to measure learning. If I had a neat way to track plan changes, it would save me (and everyone else who forgets why they changed their mind) from needing to retread ground or accidentally writing off good ideas without a good reason. I’m vague about how I’d do this, but ideas coming to mind include Kit’s probability thing plus a description of why you changed your probabilities at each time point, or a branching tree of options nested with tests and results, where you cross off options that seem wrong and add a tl;dr of what about the tests changed your mind.
I’m noticing that I’m marking when something is relevant to my investigation, even if it’s not precisely the intended experience. This time was like “shiny!” and immediately relevant in my mind, like I’d just stumbled across a potential key. It was much more of an “aha moment”, rather than the “vague but important” feeling that I was marking as a precursor to the “aha moments”.
High variance experiments
I finished up my user testing interviews. These came out of my attempts to focus only on “high variance” experiments that were likely to give me new information.
On the surface, this experiment nailed it. I wanted to find out what would increase engagement on my blog. My experiment was to do only the experiments that felt highly likely to give me concrete insights that would change my plans. I started with a mile long list of hypotheses and ended with two concrete interventions that seem really promising. I now have better evidence for these hypotheses than the others, and I’m going to implement them. This worked great for increasing the chance of new information that changes my plans.
But I also want to be able to tell later if those plan changes help me better achieve my goal. It feels similar to Ray’s “concrete/less-relevant” to “highly-relevant but less concrete” spectrum, which I’m going to phrase as “lead” and “lag” measures. My lead measure is insights that change plans (concrete, hopefully-but-not definitely relevant), and my lag measure is whether those plan changes actually increase engagement (highly-relevant but hard to measure). If they don’t, then my lead metric needs to change.
Right now, my best guess is to implement the changes, and compare the rate of blog email signups before and after, throw in some ad hoc and probably bad attempts to control for confounders, and see what that indicates. It will, unfortunately, be a few months before I get data.
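A minimal sketch of that before/after comparison, assuming the only data is a signup count and a window length on each side of the change. The function names and all counts are hypothetical, and it makes no attempt to control for confounders.

```python
def signups_per_week(signups: int, days: int) -> float:
    """Convert a raw signup count over a window into a weekly rate."""
    if days <= 0:
        raise ValueError("days must be positive")
    return signups * 7 / days


def compare_before_after(before_signups, before_days, after_signups, after_days):
    """Return weekly signup rates before and after the change, plus their ratio."""
    before = signups_per_week(before_signups, before_days)
    after = signups_per_week(after_signups, after_days)
    # The ratio is undefined if there were no signups at all beforehand.
    ratio = after / before if before > 0 else None
    return {"before": before, "after": after, "ratio": ratio}


# Hypothetical numbers: 12 signups in the 84 days before, 18 in the 84 days after.
result = compare_before_after(12, 84, 18, 84)
print(result)
```

A ratio above 1 is only weak evidence that the changes helped, since seasonal traffic or a single popular post could move the rate just as much.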
(Honestly, I feel a little silly at how well “just do experiments that you think will teach you stuff” worked.)
Did you have previous experience where experiments weren’t teaching you as much that you’re comparing this to? I assume so but wasn’t entirely sure.
...
I just reached the second to last level of Downwell Hard Mode!
I had set my goal as “beat the first 3 levels (“zone 1”) on Hard Mode without taking any damage”. I grinded away at that for a month and didn’t seem to improve too much. But,
a) I eventually unlocked some kind of zen mode that combined a few skills
b) then I went back to playing all the levels instead of restarting the game at level 3. And I think a lesson is that each zone had some “low hanging fruit” I could learn. I had exhausted the low hanging fruit of zone 1, and the next skills required to advance at “zone 1 taking no damage” were quite hard to acquire. But learning the basic patterns of the enemies in zones 2-4 seems like it’s dramatically improved my range.
>Did you have previous experience where experiments weren’t teaching you as much that you’re comparing this to? I assume so but wasn’t entirely sure.
Yeah, I’ve been doing regular experiments for a while, but I wasn’t pushing as hard for the ones where I didn’t already have an idea of what the outcome would be.
I’m predicting that much of the stuff that causes measurable cognitive improvement will be by the mechanism of making people spend less time on social media or otherwise dithering about on the internet.
e.g. something like 20% of the measured benefit from things like reading the Sequences, the CFAR handbook, or singlemindedly playing a specific indie game comes from being the rare thing sufficient to shake people out of habits formed around the addictive gravity-well of the commercialized internet’s click/scroll maximization.
People should not have “shower thoughts”; in the 90s and 2000s people would zone out and have “shower thoughts” while reading books, the extropy email list, and sometimes even watching TV.
Specifically, somewhere around a 20% chance that >30% of the benefit unexpectedly comes from this dynamic, and a 50% chance that 10-30% of the benefit unexpectedly comes from this dynamic.
If MIRI or CFAR or EA’s extremophile ascetics were already successful at getting their best thinkers to consistently spend time thinking or pacing or writing on whiteboards/notes after work, instead of on the commercialized internet, that’s a strong update against my hypothesis.
I already expect feedbackloop-first rationality to cause substantial cognitive enhancement on its own. This problem is a confounding variable; the goal of the practice club is to find ways to amplify intelligence, but the experiments will show high measured effectiveness from things like singlemindedly playing a specific indie game or driving or taking long showers, even though the causal mechanism actually comes from those things increasing the proportion of time a rationalist spends thinking at all, not from increasing intelligence or mitigating intelligence-reducing feedback loops.
People need to already be spending 2+ hours a day distraction-free, in order to see if the results are coming from cognitive enhancement, rather than from distraction-removal like long showers or driving.
Regarding ‘shower thoughts’ and ‘distraction-removal’, as far as their relation to cell phones and youtube videos and other ‘super fun’ activities as one might call them, I definitely think that there’s something there.
I’ve long had the thought that ‘shower thoughts’ are simply one of the rare times in a post-2015ish world that people actually have the opportunity to be bored. Being bored is important. It makes you pursue things other than endless youtube videos, video games, porn, etc. As well, showering and washing dishes and other ‘boring’ activities are meditative!
It’s a common meme these days that people need to always watch something while they eat. Some people listen to podcasts while they shower. Some people use their phone at stoplights. All of this points to a tendency for people to fill every single empty space of any kind with content of some sort, and it really doesn’t seem healthy for the human brain.
This is an interesting video I watched today while filling every single empty moment in my life with content like I’m being disparaging about, and it relates to the topic. The author describes a process by which you can actually do the sorts of things you want to do by making sure there isn’t anything else in that block of time that’s more fun / satisfying / engaging. If work is the most fun thing you’re allowing yourself to do, then you’re going to work. If you’re locked in a room with a book and a cell phone, you’re going to want to use the cell phone. If you just have a book, you’re going to read the book. You can apply this principle to your entire life.
Sorry if this post seems a little chaotic, lots of thoughts and I didn’t have the time or energy at the end of the day to link them together more coherently...
This comment is potentially vastly better worded/explained than my original comment. I will probably be quoting it when describing this problem, and will make serious efforts to write more like this in the future.
Thank you! That’s very kind of you to say. I haven’t spent a lot of time ‘assimilating into LessWrong’ so I sometimes worry that I come off as ignorant or uninformed when I post, it’s nice to hear that you think I made some sense.
Really dumb question, but Downwell definitely does not do dynamic game difficulty balancing, right?
I don’t think so.
My current best guess of what’s going on is a mix of:
It’s actually fairly cognitively demanding for me to play at my peak. When I do beat level 3 with full health, I typically feel like my brain just overclocked itself. So I think during normal play, I start out playing “medium hard”, and if I notice that I’m losing I start burning more cylinders or something. And if I start off playing quite hard, I get kinda tired by level 3.
But also, there’s a survivorship bias of “sometimes if I’ve taken damage I just give up and start over”, which may mean I’m forming an incorrect impression of how well I’d have done.
For Downwell, have you tried Floaty mode? I found it made the game a lot easier/less frustrating. Especially when you’re trying to rack up bigger combos—and doing so as much as possible is IMO essential given how sparse upgrades/health/money are.
I recently wrote a Question about learning that lacked a lot of polish but poked at a few of the ideas discussed here. I haven’t had time just now to read the entire post but I plan to come back to it and comb through it to try to shore up the ideas I have about learning right now. I’m also reading Ultralearning which is interesting although a little popsci. I find all this stuff really interesting because I’ve been having a lot of trouble learning things lately, feeling like my brain just isn’t working like it used to since I got covid. I’ve tried programming probably 5-6 times in the past in my life and I’m giving it another go now, hoping it can stick this time.
Also, regarding Downwell: Try playing without ever jumping, just falling. Fall on enemies that are bounce-able without ever jumping or shooting and see how deep you can get. You can get pretty far this way!
When you say you were practising Downwell for the course of a month, how many hours was this in total?
Rough guess, ~45 hours.
Try taking one level at a time and pausing between levels. You might just be getting frustrated, and getting some freshness will help.