it’s surprising just how much of cutting edge research (at least in ML) is dealing with really annoying and stupid bottlenecks. pesky details that seem like they shouldn’t need attention. tools that in a good and just world would simply not break all the time.
i used to assume this was merely because i was inexperienced, and that surely eventually you learn to fix all the stupid problems, and then afterwards you can just spend all your time doing actual real research without constantly needing to context switch to fix stupid things.
however, i’ve started to think that as long as you’re pushing yourself to do novel, cutting edge research (as opposed to carving out a niche and churning out formulaic papers), you will always spend most of your time fixing random stupid things. as you get more experienced, you get bigger things done faster, but the amount of stupidity is conserved. as they say in running: it doesn’t get easier, you just get faster.
as a beginner, you might spend a large part of your research time trying to install CUDA or fighting with python threading. as an experienced researcher, you might spend that time instead diving deep into some complicated distributed training code to fix a deadlock or debugging where some numerical issue is causing a NaN halfway through training.
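as a concrete illustration of that second kind of debugging, here’s a minimal sketch (pytorch assumed; the helper name add_nan_checks is made up for the example) of the sort of throwaway instrumentation you end up writing to find where the first non-finite value appears:

```python
# minimal sketch (pytorch assumed): raise on the first module whose output
# goes non-finite, so you can see where the NaN enters mid-training.
import torch
import torch.nn as nn

def add_nan_checks(model: nn.Module):
    # illustrative helper: registers a forward hook on every submodule
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                raise RuntimeError(f"non-finite output in {name}")
        return hook
    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))

# usage on a toy model
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 1))
add_nan_checks(model)
out = model(torch.randn(4, 8))  # raises as soon as any layer produces NaN/inf
```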
i think this is important to recognize because you’re much more likely to resolve these issues if you approach them with the right mindset. when you think of something as a core part of your job, you’re more likely to engage your problem solving skills fully to try and find a resolution. on the other hand, if something feels like a brief intrusion into your job, you’re more likely to just hit it with a wrench until the problem goes away so you can actually focus on your job.
in ML research the hit it with a wrench strategy is the classic “google the error message and then run whatever command comes up” loop. to be clear, this is not a bad strategy when deployed properly—this is often the best first thing to try when something breaks, because you don’t have to do a big context switch and lose focus on whatever you were doing before. but it’s easy to end up trapped in this loop for too long. at some point you should switch modes to actively understanding and debugging the code, which is easier to do if you think of your job as mostly being about actively understanding and debugging code.
earlier in my research career i would feel terrible about having spent so much time doing things that were not the “actual” research, which would make me even more likely to just hit things with a wrench, which actually did make me less effective overall. i think shifting my mindset since then has helped me a lot.
a corollary is that i think even once AI can automate the “google for the error and whack it until it works” loop, it will probably still be quite far off from being able to fully automate frontier ML research, though it certainly will make research more pleasant.
I agree if I specify ‘quite far off in ability-space’, while acknowledging that I think this may not be ‘quite far off in clock-time’. Sometimes the difference between no skill at a task and very little skill is a larger time and effort gap than the difference between very little skill and substantial skill.
Not only is this true in AI research, it’s true in all science and engineering research. You’re always up against the edge of technology, or it’s not research. And at the edge, you have to use lots of stuff just behind the edge. And one characteristic of stuff just behind the edge is that it doesn’t work without fiddling. And you have to build lots of tools that have little original content, but are needed to manipulate the thing you’re trying to build.
After decades of experience, I would say: any sensible researcher spends a substantial fraction of time trying to get stuff to work, or building prerequisites.
This is for engineering and science research. Maybe you’re doing mathematical or philosophical research; I don’t know what those are like.
I can emphatically say this is not the case in mathematics research.
Interested to hear how you would put this with “research” tabooed. Personally I don’t care if it’s research as long as it works.
Completely agree. I remember a big shift in my performance when I went from “I’m just using programming so that I can eventually build a startup, where I’ll eventually code much less” to “I am a programmer, and I am trying to become exceptional at it.” The shift in mindset was super helpful.
More and more, I’m coming to believe that one big flaw people have in general is not realizing how much you need to deal with annoying, pesky/stupid details to do good research, and I think some of this dictum applies to alignment research as well.
There is thankfully more engineering/ML experience on LW, which partially alleviates the issue, but failing to realize that pesky details matter a lot in research/engineering remains a problem that basically no one particularly wants to deal with.
I would hope for some division of labor. There are certainly people out there who can’t do ML research, but can fix Python code.
But I guess, even if you had the Python guy and the budget to pay him, waiting until he fixes the bug would still interrupt your flow.
I think there are several reasons this division of labor is very minimal, at least in some places.
You need way more of the ML engineering / fixing-stuff skill than ML research skill. Like, vastly more. There is still a small handful of people who specialize full time in thinking about research, but they are few and often very senior. This is partly an artifact of modern ML putting way more emphasis on scale than academia does.
Communicating things between people is hard. It’s actually really hard to convey all the context needed to do a task. If someone is good enough to just be told what to do without too much hassle, they’re likely good enough to mostly figure out what to work on themselves.
Convincing people to be excited about your idea is even harder. Everyone has their own pet idea, and you are the first engineer on any idea you have. If you’re not a good engineer, you have a bit of a catch-22: you need promising results to get good engineers excited, but you need engineers to get results. I’ve heard of even very senior researchers finding it hard to get people to work on their ideas, so they just do it themselves.
This is encouraging to hear as someone with relatively little ML research skill in comparison to experience with engineering/fixing stuff.
For sure. The more novel an idea I am trying to test, the deeper I have to go into the lower level programming stuff. I can’t rely on convenient high-level abstractions if my needs are cutting across existing abstractions.
Indeed, I take it as a bad sign of the originality of my idea if it’s too easy to implement in an existing high-level library, or if an LLM can code it up correctly with low-effort prompting.
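As a toy illustration of what going below the high-level abstractions can look like (PyTorch assumed; the class name and gradient rule here are invented for the example, not something from the comment above), a custom autograd Function lets you specify a backward pass that no stock layer provides:

```python
# toy sketch (pytorch assumed): forward pass behaves like ReLU, but the
# backward pass uses an invented custom gradient rule that no stock
# high-level layer exposes.
import torch

class ForwardReLUCustomBackward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)  # ordinary ReLU on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # illustrative rule: pass a small gradient even where the unit was off
        slope = torch.where(x > 0, torch.ones_like(x), torch.full_like(x, 0.1))
        return grad_output * slope

x = torch.randn(5, requires_grad=True)
ForwardReLUCustomBackward.apply(x).sum().backward()
print(x.grad)  # reflects the custom backward rule, not the plain ReLU derivative
```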