Thanks for the context! I didn’t follow this discourse very closely, but I think your “optimistic assumptions” post wasn’t the main offender—it’s reasonable to say that “it’s suspicious when people are bad at backchaining but think they’re good at it, or when their job depends on more backchaining than they’re capable of”. I seem to remember reading some responses/related posts that I had more issues with, where the takeaway was explicitly that “alignment researchers should try harder at backchaining and one-shotting Baba Is You-like problems because that’s the most important thing”, instead of the more obvious but less rationalism-vibed takeaway of “you must (if at all possible) avoid situations where you have to one-shot complicated games”.
I think if I’m reading you correctly, we’re largely in agreement. All plan-making and game-playing depends on some amount of backchaining/one-shot prediction, and there is a part of doing science that looks a bit like this. But there are ways of getting around having to brute-force it: noticing regularities and developing intuitions, taking “explore” directions in explore-exploit tradeoffs, etc.—this is sort of the whole point of RL, for example.
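(As a toy illustration of the explore-exploit point—not from the original discussion, just a minimal sketch of a standard epsilon-greedy bandit, with all names hypothetical: instead of one-shotting the “best” move, you occasionally try random moves and let the running estimates improve.)

```python
import random

def epsilon_greedy_bandit(arms, pulls=1000, epsilon=0.1):
    """Estimate the value of each arm while occasionally exploring at random.

    `arms` is a list of callables, each returning a sampled reward.
    """
    counts = [0] * len(arms)
    values = [0.0] * len(arms)
    for _ in range(pulls):
        if random.random() < epsilon:
            i = random.randrange(len(arms))                      # explore: try a random arm
        else:
            i = max(range(len(arms)), key=lambda j: values[j])   # exploit: best estimate so far
        reward = arms[i]()
        counts[i] += 1
        values[i] += (reward - values[i]) / counts[i]            # incremental running average
    return values

# e.g. two "moves" with different, unknown payoffs
estimates = epsilon_greedy_bandit([lambda: random.gauss(0.3, 1), lambda: random.gauss(0.5, 1)])
```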
I also very much like the points you made about plans. I’d love to understand more about your OODA loop points, but I haven’t yet been able to find a good “layperson” operationalization of OODA that’s not competence porn (in general, I find “sequential problem-solving” stuff coming from pilot training useful as inspiration, but not directly applicable because the context is so different—and I’d love a good reference here that addresses this carefully).
A vaguely related picture I had in mind when thinking about the Baba Is You discourse (and writing this shortform) comes from being a competitive chess player in middle school. In middle school competitions and in friendly training games in chess club, people make a big deal out of the touch-move rule: you’re not allowed to play around with pieces while planning, and you need to form a plan entirely in your head. But when you watch a friendly game between two high-level chess players, they will constantly move each other’s pieces around to show each other positions several moves into the game that would result from various choices. To someone at a high level (higher than I ever got to), there is very little difference between playing out a line on the board and playing it out in your head, but it’s helpful to move pieces around to communicate your ideas to your partner.

I think that (even with a scratchpad) there’s a component of this here: there is a kind of qualitative difference between “learning to track hypothetical positions well” / “learning results” / “being good at memorization and flashcards” vs. having better intuitions and ideas. A lot of learning a field / being a novice in anything consists of being good at the former. But I think “science”, as it were, progresses by people getting good at the latter. Here I actually don’t think that the “do better” vibe corresponds to not being good at generating new ideas: rather, I think that rationalists (correctly) cultivate a novice mentality, constantly learning new skills and approaching new areas, in which “train an area of your brain to track sequential behaviors well” (analogous to mentally chaining several moves forward in Baba Is You) is the core skill. And then when rationalists do develop this area and start running “have and test your ideas and intuitions in this environment” loops, these are harder to communicate/analyze, and so their importance sort of falls off in the discourse (while on an individual level people are often quite good at them—in fact, the very skill of “communicating well about sequential thinking” is, I think, something many rationalists have developed deep competence in).