Exercise: Planmaking, Surprise Anticipation, and “Baba is You”
This is an exercise about Planmaking and Surprise-Anticipation. It takes about 2-3 hours. It’s a small, simplified exercise, but I think it’s a useful building block.
Humans often solve complex problems via iteration and empiricism. Usually, trying to figure everything out from first principles without experimenting is a bad idea. You can spend loads of time thinking, and then you go outside and interact with reality for 5 minutes and realize all that thinking was pointed in the wrong direction.
But some important problems have poor feedback loops, such that iteration/empiricism don’t work very well. Experimentation might take a really long time, the results might be noisy, or you might just really need to get something right on the first try.
Often, when making a plan in a confusing domain, it’s enough to just ask yourself “how do I expect this plan to turn out?” to get you to notice ways the plan is likely to fail. Then you can fix those things. This is often faster than doing the entire plan, and watching it fail, and then doing it all over again.
Side note: a particular worry I have is that a lot of people entering the AI Alignment space don't feel like it's tractable to tackle more theoretical research directions, and end up gravitating to interpretability or evals because they at least have a feedback loop. One thing I think this exercise is "for" is laying some building blocks for "how to think about a situation where your feedback loop is terrible, and eke as many bits out of it as you can to help you focus your strategy."
I don't know whether this will successfully transfer to the domains I care about, but that's one thing the exercise is aiming at.
This exercise uses the Baba is You videogame to teach a combination of rationality skills (which I suspect weave together into something greater than the sum of their parts):
Planmaking
Calibration
Inner Sim / Internal Surprise-o-meter
Patience
Those skills weave together into something similar to Murphyjitsu, but with a somewhat different flavor. The exercise is intended to build upon Exercise: Meta-strategy [TODO], and is a stepping stone building towards Generating 10x Plans.
I’ve tested this on ~6 people, including myself. So, this is still experimental, but I feel good enough about it to ship it publicly for now. Let me know if you try it. You can post your results in the comments (please spoiler-block them).
Format:
You’ll be given a puzzle video game level, which you haven’t played before.
Instead of fiddling around, playing with the game the way you might normally do… you will just look at the screen, and make a complete plan for solving a given level, before you begin to move your character around.
Write down that plan as a series of steps.
Before you execute your plan, for each step in the plan, consider all the ways that you might be surprised when you execute that step.
Loop through all of your “possible surprises”, and consider if any of them actually seem more likely than your mainline plan. Consider updating your plan. If there is a step that might go multiple ways, try making multiple guesses and plans.
Are you confident in your plan? If so, execute it.
Did the plan go the way you expected? Spend 10 minutes reflecting on what you learned, and what you could have done differently.
I recommend doing the exercise using this google doc worksheet.
(I think this exercise would work well as a meetup where one meetup-organizer has read the post thoroughly, has already done the exercise once, and can help other people who are confused or stuck)
Step 0: Read this blogpost
This is a fairly involved exercise. I’ve broken it down into steps so you only have to think about one thing at a time, but it’s useful to first read through the whole thing so you can see how all the pieces fit together.
Step 1: Download the Game, Pick a Level
We’re practicing planmaking in a game called Baba is You. Baba is You is a really great puzzle game, but we’re adding some additional wrinkles.
My favorite version of this exercise is one where you’ve never played Baba is You before, and part of the task is figuring out the core gameplay mechanics without even interacting with the game. (If you haven’t heard anything about the game before, I highly recommend not looking anything up first)
If you’re a rationalist nerd on LessWrong, you’ve probably heard about or even played the game. That’s fine. Unless you’ve literally beaten the entire game, this exercise works if you play a level you haven’t played before. (I recommend picking a level that introduces at least one new mechanic you haven’t seen before)
So, go download the game on Steam, or whatever device you prefer.
If you’ve never played the game before, I recommend starting with a particular level I made, via:
Hit “escape” when the game starts and begins playing the intro cinematic (which is a mild spoiler for the game), and click “Return to map.”[1]
Click “Play Levels” from the menu
Click “Get new levels”
Click “Use level code”
Enter: “K9RG-G8K2”
If you have played some Baba is You, I recommend looking at these levels first:
44YI-7VH7 (“Tiny Pond”)
98x8-I1IC (“The River”)
If you’ve played those already, then basically pick up wherever you left off in the game, and try to find a level that “feels hard-but-not-too-hard.” (A somewhat rough problem is that people who’ve previously played Baba is You tend to stop when they reach a very difficult puzzle, and that means it may be a bit too hard for this exercise to work well. You will unfortunately have to use your own judgment on which levels are a good experience for this.)
Remember to pull up the google doc worksheet.
Step 2: Observation, Orientation and Livelogging
Once you’ve gotten to a level that seems like a good fit, stare at the level for a bit and soak in the details. It will look something like this:
I recommend writing down all the details that seem relevant.
More generally: I recommend livelogging. Jot down notes about your thought process as things occur to you.
After soaking in the level, start thinking through how to solve it.
If you’re new to the game, you might ask “How do I solve it? What are the rules? I don’t even know what it means to beat a level of this game.” Part of the point of this exercise is you do have information about that, even without having played the game. You’ve played other games before. If you haven’t, you’ve probably interacted with the world and you can make an informed guess about what you need to achieve.
A level might introduce multiple new mechanics you haven’t encountered before. For each new element, I suggest coming up with some guesses as to how that element will behave, or what happens when you try to interact with it.
Step 3: Write out your plan
Okay, now start writing down your best guess plan for solving the level.
This can include things like “I think if I do X, it’ll most likely cause Y to happen, but might cause Z to happen instead. If Y happens, I’ll do Plan A, if Z happens, I’ll do Plan B.”
(You don’t need to go crazy branching out on every possible option for this exercise, just pick the most likely branch, and maybe 1-2 backup plans)
Step 3A: If you get stuck, brainstorm strategies
If you feel like you have no idea what to do, stop and go meta. Try spending 10 minutes brainstorming strategies that might help.
Examples of strategies might include:
Try to break the problem down into subgoals
Notice that you’re tired, and get a snack
List as many dumb ideas as you can
I recommend setting a literal 10 minute timer, and trying to come up with at least 10 strategies.
Step 4: Predict Surprises
For each step in your plan, write down how likely it is to go the way you expect (e.g. 10% likely, 50% likely, 90% likely, etc).
For each step, ask “Does this seem like an area I expect to get surprised? What other things might happen instead of my main prediction?”
You might notice that you actually think there’s a second outcome that feels more likely than your original plan. If so, maybe update your plan + predictions.
When you are done, write down your overall probability that your plan will work.
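One way to sanity-check that overall number is to multiply your per-step estimates together. This is a minimal sketch, not part of the worksheet, and it assumes the steps succeed or fail independently, which is rarely quite true in a puzzle (one misunderstood rule can break several steps at once). The step descriptions and numbers are made up for illustration.

```python
from math import prod


def overall_success_probability(step_probabilities):
    """Multiply per-step probabilities, ASSUMING the steps are independent."""
    for p in step_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(step_probabilities)


# Hypothetical plan: (step description, chance it goes the way I expect).
plan = [
    ("Move the block aside", 0.9),
    ("The new object behaves the way I guessed", 0.7),
    ("Reach the goal", 0.8),
]

overall = overall_success_probability([p for _, p in plan])
print(f"Overall: {overall:.0%}")
```

Three steps that each feel fairly safe multiply out to about a coin flip here. If your gut-level overall estimate is much higher than the product, that gap is itself worth noticing, and the weakest step is probably where to look for surprises.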
Step 5: Execute the Plan
Once you have a plan written down, and you’ve thought about how likely you are to be surprised… execute the plan!
...
What happened? Did you get it right on the first try? If so, yay! If not, think more, and see if you can come up with a new plan.
Did you get any “surprise surprises” (as opposed to “surprises in a place you kinda expected to get surprised”)?
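If you want to track calibration across several levels, one option is to log each prediction alongside what actually happened and compute a Brier score (the mean squared gap between the probability you stated and the 0-or-1 outcome; lower is better). This is a sketch under my own assumptions rather than something the worksheet requires, and the logged numbers are invented.

```python
def brier_score(predictions):
    """Mean squared error between stated probabilities and outcomes.

    `predictions` is a list of (probability, happened) pairs, where
    `happened` is True if the step went the way you predicted.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    total = sum((p - float(happened)) ** 2 for p, happened in predictions)
    return total / len(predictions)


# Hypothetical log: (probability I gave that the step would go as expected, did it?)
log = [(0.9, True), (0.7, False), (0.8, True), (0.5, False)]
score = brier_score(log)
print(f"Brier score: {score:.3f}")
```

Always answering 50% scores 0.25, so a score below that suggests your predictions carry some information. A confident prediction that turns out wrong (like the 70% above) is the closest numeric analogue of a “surprise surprise.”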
Step 5b: Try earnestly 3 times, then, idk screw around
If your plan didn’t work, go back to the drawing board and try again. You’ve lost some imaginary points from Raemon, but, you can still try again to make a followup plan. If your assumptions were wrong, re-examine them and think about what else might work.
Each time you try/fail, I recommend setting another 10 minute timer for “meta-strategy brainstorming.”
If you’ve tried this earnestly 3 times, after the 3rd time, I think it’s fine to switch to just trying to solve the level however you want (i.e. moving your character around the screen, experimenting).
Step 6: Debrief / Meta Reflections
Whether you got the answer right or wrong, now you stop to ask “how did I do? Could I have done better?”
Set a 10 minute timer, and brainstorm potential takeaways. Some possible prompts:
What were some useful thoughts that you thought?
What were the key pivot points in your thinking?
How could you have gotten to the right concept faster?
Summarize the key concept of the solution.
Is there an abstract generalization of that concept?
How does that generalization apply to other problems?
What thinking approach brought me to the right answer?
Does that approach generalize?
Followup
I think it’s worth doing this exercise a couple times until you’ve gotten the hang of it. You can do it on different puzzle games.
The next step after that is “try making some actual real life plans for goals that feel somewhat hard/confusing”, and reflect on which parts (if any) seem to transfer. I’m working on some explicit exercises for this, but so far it seems to depend a lot on an individual’s goals. So far, this process seems to take a few days rather than a few hours.
[1] I’m not actually 100% sure how to do this because I’ve already seen the intro cinematic and it only shows once per user. But I believe it was possible to skip it at the last workshop I ran.
If you’ve already played Baba Is You and are looking for other options: Humble Bundle has a puzzle bundle going for the next 5 days. It’s $10 for 7 games, of which The Witness is the lowest rated at 85% positive, and the rest range from 93-99%
Baba Is You is an unusual puzzle game in a way that seems relevant here.
One way of classifying puzzle games might be on a continuum from logic-based to exploration-based (or, if you like, between logical uncertainty and environmental uncertainty).
At the first extreme you have stuff like Sudoku, or logic grids, or three gods named True, False, and Random, or blue eyes. In these puzzles, you are given all necessary information up-front, and you should (if the puzzle is well-constructed) be able to verify the solution entirely on your own, without requiring an external authority to confirm it.
At the opposite extreme, there’s 20 questions or Mastermind or Guess Who?, where the entire point is that necessary information is being withheld and you need to interact with the puzzle to expose it. Knowing all the information is the solution; there would be no point without the concealment.
Baba Is You is pretty close to the first extreme, but not all the way there. It does ask you to learn the basic rules of the game by interacting with it, and it does gradually introduce new rules, but most of the difficulty comes from logical uncertainty. Some puzzles do not introduce new rules at all, or only introduce new rules in the sense of exploring the edge cases of a previously-established rule. It also makes the entire puzzle visible at once, so once you understand the rules it becomes a pure logic puzzle.
This exercise relies on the possibility of being empirically surprised, but also on being able to make fairly detailed plans in spite of that possibility. This seems like it requires (or at least heavily benefits from) being at a pretty narrow area within the logic ⇔ exploration continuum, which Baba Is You happens to be exactly situated at.
Most puzzle video games lean more heavily on exploration than that. You mentioned The Witness, which I would classify as primarily exploration-based: each series of puzzles centers around a secret rule that you need to infer through experimentation, and most puzzles are easy once you have figured out the secret rule. (The game Understand, mentioned by another commenter, has the same premise.)
Another puzzle game I recognize from the bundle you linked is Superliminal, which has the premise that you’re inside a dream and solve puzzles using dream-logic. I’d also consider that heavily exploration-based.
The Talos Principle is much closer to Baba Is You’s point on this continuum, with a relatively small number of rules and an emphasis on applying them creatively, although in The Talos Principle you can’t always see the entire puzzle before you begin solving it, and I’d say the puzzle components’ appearances are less suggestive of their functions than the adjectives in Baba Is You, probably making it significantly harder to guess how they’ll behave without doing some experimentation.
Patrick’s Parabox is similar to Baba Is You in that they are both Sokoban games, though I didn’t play too far in Patrick’s Parabox because the puzzles felt more workaday and less mind-bendy and I just got bored. (Though it’s highly rated, so presumably most people didn’t.)
Quick note that I have another exercise in the works about the beginning of Patrick’s Parabox, but after having investigated more I think the rest of the game doesn’t hold up for my purposes.
I like your breakdown of why Baba is You fits exactly here.
I do think most puzzle games lend themselves to some kind of rationality exercise, but not necessarily this one.
It’s 6 months later and I still feel impressed by how this comment articulated what’s important about Baba is You.
Someone recently mentioned another major thing about Baba though, which is that it has a high density of puzzles that imply a few particular potential solutions, which are dead ends. (Whereas many puzzle games have the property of “I feel totally stuck” to “oh, the solution is suddenly obvious.”)
This makes it particularly valuable as a research-taste training tool. (I’m not sure offhand how intrinsic the mechanics of Baba are to this property. You could almost surely design Baba puzzles that don’t have this property; it might just be about a particular skill of the game designer.)
I think Causality would be good for this. Levels have their full state visible from the start, and there’s no randomness. There’s a relatively small number of mechanics to learn, though I worry that some of them (particularly around details of movement, like “what will an astronaut do when they can’t move forward any more?”) might be “there are multiple equally good guesses here” which seems suboptimal.
Actually, there’s one detail of state that I’m not sure is visible, in some levels:
When you come out of a portal, which way do you face? I think there’s probably a consistent rule for this but I’m not sure, I could believe that in some levels you just have to try it to see.
Or Understand for 4 EUR which has a highly upvoted lesswrong post recommending it.
Fwiw I tried out Understand and was underwhelmed. (Cool concept but it wasn’t actually better as an exercise than other good puzzle games)
What about Outer Wilds? It’s not strictly a puzzle game, but I think it might go well with this exercise. Also, what games would you recommend for this to someone who has already played every available level in Baba Is You?
I think I’d end up constructing a new exercise for Outer Wilds but could see doing something with it. (I have started but not completed Outer Wilds)
I think this exercise works best for games where puzzles come in relatively discrete chunks where you can see most of the puzzle at once.
I recently played Outer Wilds and Subnautica, and the exercise I recommend for both of these games is: get to the end of the game without ever failing.
In Subnautica that’s dying once; in Outer Wilds it’s a spoiler to describe what failing is (successfully getting to the end could certainly be argued to be a fail).
I failed in both of these. I played Outer Wilds first and was surprised at my fail, which inspired me to play Subnautica without dying. I got pretty far but also died, from a mix of one unexpected game mechanic, careless measurement of another mechanic, and a lack of redundancy in my contingency plans.
I haven’t beaten every level in the game, but I don’t have access to any levels that I haven’t played before, because the reason I stopped playing was that I had already tried and failed every remaining available level.
(Though I suppose I could cheat and look up the solution for where I got stuck...)
This might not apply to the early levels you’ve focused on in your examples, but an observation I made while playing the more advanced levels of this game was that often there was not just one key concept.
In most puzzle games that I’ve played, I find I can quickly get a sort of feel for the general shape of the solution: I start here, I have to end up there, therefore there must be a step in the middle that bridges those two. This often narrows the possible search space quite a lot, because the missing link has to touch the parts I know about at both the beginning and the end.
Lots of puzzles in Baba Is You have two significant steps in the middle. And this is a huge jump in difficulty, because it means there’s an intermediate state between those two steps and I have no idea what that intermediate state looks like so I can’t use it to infer the shape of those steps. Each of the missing steps has only 1 constraint instead of 2.
I’m not sure I understand. If you have levels left over that you haven’t beaten because they were too hard, I think this is still a fine exercise (the fact that “it’s hard” isn’t a crux for me). I do think it’s doable, and I think the constraints of the exercise probably help about as much as they hinder.
(You might not succeed within three tries of one-shotting, but I think you’re more likely to go on to beat the level afterwards, and still learn something from it.)
On a literal level, I can’t play “a level I haven’t played before”, which is what the instructions call for.
On a practical level, I’ve already spent multiple hours beating my head against this wall, and when I stopped I had no remotely promising ideas for how to make further progress. (And most of that time was spent staring at the puzzle and thinking hard without interacting with it, so it was already kind of similar to this exercise.)
Admittedly, this was years ago, so maybe it’s time to revisit the puzzle anyway.
I will note that a level editor for this game seems to exist, so in theory you could craft custom levels for this exercise. Though insofar as the point is being potentially-surprised by the rules, maybe that doesn’t help if you aren’t inventing new rules as well.
One note: custom levels now exist and you can go browse them directly even if you’ve beaten the game.
I do agree that this exercise, as-worded, probably nudges towards a flavor of “explicit thinking”, which I don’t think is even necessarily the best strategy for Baba is You overall.
I don’t think this exercise necessarily says “think explicitly” – the section on metacognitive brainstorming is meant to include fuzzy/experiential/“go-take-a-shower”/“meditate” style options.
To clarify slightly more: I think it’s fine to do a hard level you haven’t beaten before, even if you’ve played it.
Nod, these instructions were generated for people doing early levels but I agree about how later levels feel.
Originally, I just had people either start at the beginning, or (if they’d played before), pick a level near where they left off.
This wasn’t great because:
a) the first few levels are tutorials that actively force you to learn the mechanics of the game, and were too “easy” for this exercise.
b) if you’ve played awhile, there is some selection pressure on your available levels where “they’re a place you were previously stumped, and were maybe extra hard.” The one person who didn’t rate this exercise highly at my “Fractal Strategy” workshop was someone who picked an existing Stumped puzzle, beat their head against it for 2 hours and then looked up the solution and found it was something they really wouldn’t have been able to figure out.
Levels vary in how much they feel “fair” for this exercise.
So! I’ve constructed a new opening puzzle for people who have never played Baba is You before. I’ve run it through a couple of playtests, iterated, and I think I have a decent version of it. I’ve also picked two “partway through the game” puzzles I think are particularly good.
I’ve created a spreadsheet which I plan to update as I do more iterations of the exercise and try other levels.
(Personally, I would be rather intimidated by such a long list of questions at Step 6. I would be thinking something like, question one: why do I think it wasn’t just sheer dumb (lack of) luck? And question two, have I had fun?)
Nod, the prompts are meant to be suggestions and you can come up with your own prompts.
I am intending this exercise primarily for people who are interested in answering those sorts of questions though. (But, I also think the exercise is fun, and worth trying/evaluating on that basis if it feels interesting to you)