I would like to gain mastery in the domain of alignment research. Deliberate practice is a powerful sledgehammer for gaining mastery. But unlike in something like chess or piano, it’s not clear to me how to use this sledgehammer in this domain. The feedback loops are extremely long, and the “correct action” is almost never known ahead of time, or even right after taking the action.
What are some concrete ways I could apply deliberate practice to alignment research?
One way would be to apply it to skills that are sub-components of research, rather than trying to rapidly practice research end-to-end.
The sub-skill I’ve thought of that is the best fit for deliberate practice is solving math and physics problems, à la Thinking Physics or other textbook exercises. Being better at this would certainly make me a better researcher, but it might not be worth the opportunity cost, and if I ask myself, “Is this cutting the enemy with every strike?” then I get back a no.
Another thing I can think of is trying to deliberately practice writing, which is a big part of my research. I could try to be more like John and write a post every week, to get lots of quick feedback. But is this fast enough for deliberate practice? I get the sense that the feedback cycle has to be almost real-time. Maybe writing tweet-length explanations is the minimal version of this?
I’d appreciate any other concrete ideas! (Note that my research style is much more mathy/agent-foundations flavored, so programming is not really a sub-skill of my research.)
I’ll tell you what I’m doing right now to practice paper reading for my grad program in BME (biomedical engineering). Maybe you can adapt it for your field.
I created a spreadsheet. In this spreadsheet, I have columns for the following categories:
Item
Detail
Research gap
Translational relevance
Function
Mechanism
Finding
In the “Item” column, I list each major thing being used in the experiment. For example, if a lentiviral vector was delivered via a scaffold to transduce macrophages with an IL-10 gene in a mouse spinal cord injury model, “Item” would include “Lentiviral vector,” “scaffold,” “macrophage,” “IL-10,” and “mouse spinal cord injury model.”
“Detail” includes the specifics of each item (e.g. it might say that the mouse spinal cord injury was a C5 left-sided hemisection). “Research gap” is where I note whatever the author says is new about their use of the item relative to past research. “Translational relevance” states why the item is relevant to curing disease in humans (obviously you’d want a different spin on this for your field). “Function” describes the purpose of the item in the experiment. “Mechanism” is how the item works to accomplish that function. “Finding” is what the paper found: from the author’s own words if you must, but preferably your own view of what the data shows.
Filling out this spreadsheet helps me think in a critical, systematic way about each paper. I actually find that it makes papers easier to read even as it pushes me to read more deeply.
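If it helps, here’s a minimal sketch of the same template in code rather than a spreadsheet app, using Python’s standard csv module. The column names are the ones listed above; the example row is hypothetical, loosely following the lentiviral-vector example.

```python
import csv

# Columns from the paper-reading template above.
COLUMNS = [
    "Item",
    "Detail",
    "Research gap",
    "Translational relevance",
    "Function",
    "Mechanism",
    "Finding",
]

# One hypothetical row as an illustration: each major "item" in a paper
# gets its own row, so a single paper usually produces several rows.
rows = [
    {
        "Item": "Lentiviral vector",
        "Detail": "Delivered via scaffold to transduce macrophages with an IL-10 gene",
        "Research gap": "Whatever the authors claim is new about this delivery approach",
        "Translational relevance": "Why this matters for treating human spinal cord injury",
        "Function": "Vehicle for getting the IL-10 gene into macrophages",
        "Mechanism": "Integrates the transgene into the host cell genome",
        "Finding": "Your own read of what the data shows for this item",
    },
]

with open("paper_notes.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

One row per item keeps the critical, systematic pass explicit, and a flat CSV makes it easy to sort or filter across many papers later.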
I just spent a week practicing Thinking Physics. (I’m not 100% sure I hit the specific target of “deliberate practice.”) I definitely think it was worth doing for a couple days. I don’t know that I’d do Thinking Physics in particular for more than a couple weeks, but it doesn’t feel “fake” to me to spend 2 weeks on it.
I set the goal of “be 95% confident in the answer”, as a rough proxy for “there were no major lingering confusions about the problem except for generic ‘maybe I missed something’”. I later added a subgoal of “specifically predict the concept that the exercise was aiming to teach.”
I alternated between “training” and “testing”. During training, I’d do a combination of things including:
Work on the problem on my own for at least 20 minutes
Work on the problem with a friend for at least another 20 minutes (sharing notes on how to approach it)
Meta-reflection on how to do better at Thinking Physics problems.
Erring more on the side of “looking at the answer”, even when I could have kept thinking, since there are diminishing returns to working things out independently, and there’s a bit of an intuition to build about which questions are actually good and worth spending more time on.
During testing, I would work entirely on my own, aim to thoroughly deconfuse myself without help, and give myself multiple days per question while aiming to get 0 wrong. (Technically I was aiming to get 5% wrong, but with multiple days per question it’d be a pretty slow process to do 20 questions and accept 1 of them being wrong, so instead I just tried to do ~5 questions and get them all right.)
My result after ~1 week of work was… well, still getting 1 out of 5 questions wrong in my test set. But I have a pretty clear sense of where I went wrong there (I was still explicitly confused about a sub-problem; I just decided I was exhausted by the question and running out of time in the two-week sprint before I needed to return to focusing on LessWrong).
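If you want to make the “be 95% confident” goal more checkable, here’s a hedged sketch of the kind of log you could keep during the testing phase. The entries are hypothetical; the point is just to compare stated confidence against actual accuracy (plus a Brier score as a rough calibration measure).

```python
# Hypothetical test-phase log: (question id, confidence stated before
# checking the answer, whether the final answer was correct).
log = [
    ("TP-01", 0.95, True),
    ("TP-02", 0.97, True),
    ("TP-03", 0.90, True),
    ("TP-04", 0.95, False),
    ("TP-05", 0.96, True),
]

n = len(log)
accuracy = sum(correct for _, _, correct in log) / n
mean_confidence = sum(conf for _, conf, _ in log) / n

# Brier score: mean squared gap between confidence and outcome.
# Lower is better; 0 would mean fully confident and always right.
brier = sum((conf - correct) ** 2 for _, conf, correct in log) / n

print(f"accuracy={accuracy:.2f}  mean confidence={mean_confidence:.2f}  brier={brier:.3f}")
```

With only ~5 questions per test block the numbers are noisy, but over several blocks this shows whether “95% confident” is actually tracking reality.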
A clever idea I was told a few weeks ago was to read the abstracts of papers, then, without looking at the actual content, try to answer the same questions they’re asking. Once you’ve tried your hand, compare your attempts with theirs, and note any improvements you could make or disagreements you have with them.