Qualities that alignment mentors value in junior researchers
This work was performed as a contractor for SERI MATS, but the views expressed are my own and do not necessarily reflect the views of the organization.
I recently conducted interviews with 7 current/former SERI MATS mentors. One of my goals was to understand the qualities that MATS mentors believe are most valuable for junior alignment researchers. I asked questions like:
Who were your most promising scholars? What made them stand out? What impressed you about them?
What are some important qualities or skills that you see missing from most MATS scholars?
What qualities were your scholars most missing? What are some things that you wish they had, or that would’ve made them more impactful?
Qualities that MATS mentors value
Endurance, happiness, & perseverance: Mentors noted that many scholars get discouraged if they’re not able to quickly come up with a promising research direction, or if they explore 1-2 directions that don’t end up being promising. Mentors commented that their most promising scholars were ones who stay energetic/curious/relentless even when they don’t have a clear direction yet.
Hustle + resourcefulness: What do you do when you get stuck? Mentors said that many scholars don’t know what to do when they’re stuck, but their promising mentees were able to be resourceful. They would read related things, email people for help, find a relevant Discord server, browse Twitter, and contact other MATS scholars + AIS researchers for help.
Ability to ask for help + social agency: Many scholars waste a lot of time trying to figure things out on their own. Mentors noted that their most promising scholars were very agentic; they often found other scholars in the program or other Berkeley researchers who could help them. This also saved mentors time.
Ability to get to know other scholars + engage in peer mentorship: According to mentors, many scholars rarely interacted with others in the stream/program. Some of the best scholars were able to form productive/mutualistic relationships with other scholars.
Strong & concrete models of AI safety: Mentors noted that strong models are important but also hard to acquire. Some mentors emphasized that you often don’t get them until you have talked with people who have good models and you’ve spent a lot of time trying to solve problems. Others emphasized that you often don’t get them until you’ve spent a lot of time thinking about the problem for yourself.
According to one mentor, the best way to get them is just to work closely with a mentor who has these models. There is no good substitute for just talking to mentors.
Additionally, mentors noted that reading is undervalued. People have written up how they think about things. One mentor said they have read “everything on Paul’s blog, which was super valuable.”
ML and LLM expertise: Some mentors valued ML skills, lots of experience playing around with language models, and strong intuitions around prompt engineering. (Unsurprisingly, this was especially true for mentors whose research interests focused on large language models).
Research communication skills: Being better at efficiently/compactly getting across what they did and what their main problems/bottlenecks were. Some mentors noted that they felt like their (limited) time in meetings with scholars could have been used more effectively if scholars were better at knowing how to communicate ideas succinctly, prioritize the most important points, and generally get better at “leading/steering” meetings.
A few observations
I was surprised at how often mentors brought up points relating to social skills, mental health, and motivation. I used to be a PhD student in clinical psychology, so I was wondering if I was somehow “fishing” for these kinds of answers, but even when I asked very open-ended questions, these were often in the top 3 things that mentors listed.
It seems plausible that general training in things like “what to do when you’re stuck on a problem”, “how to use your network to effectively find solutions”, “when & how to ask for help”, “how to stay motivated even when you’re lost”, “how to lead meetings with your research mentors”, and “how to generally take care of your mental health” could be useful.
When I converse with junior folks about what qualities they’re missing, they often focus on things like “not being smart enough” or “not being a genius” or “not having a PhD.” It’s interesting to notice differences between what junior folks think they’re missing & what mentors think they’re missing.
I think many of these are highly malleable and all of these are at least somewhat malleable. I hope that readers come away with “ah yes, here are some specific skills I can work on developing” as opposed to “oh I don’t naturally have X, therefore I can never be a good researcher.” (Also, many great researchers have deficits in at least 1-2 of these areas).
Note: These interviews focused on mentors’ experiences during the MATS Summer and Autumn 2022 Cohorts. The current Winter 2022-23 Cohort added some related features, including the scholar support team, the Alignment 201 curriculum, technical writing and research strategy workshops, a Community Manager, regular networking events, and a team of alumni from past cohorts to support current scholars. Feel free to use the MATS contact form if you have further questions about the program.
There may also be social reasons to give different answers depending on whether you are a mentor or mentee. I.e., answering “the better mentees were those who were smarter” seems like an uncomfortable thing to say, even if it’s true.
(I do not want to say that this social explanation is the only reason that answers between mentors and mentees differed. But I do think that one should take it into account in one’s models)
+1. I’ll note though that there are some socially acceptable ways of indicating “smarter” (e.g., better reasoning, better judgment, better research taste). I was on the lookout for these kinds of statements, and I rarely found them. The closest thing that came up commonly was the “strong and concrete models of AI safety” (which could be loosely translated into “having better and smarter thoughts about alignment”).
+1, though I will note that skills 2-5 listed here are pretty strongly correlated with being smarter. It’s possible the mentors are factoring the skills differently (more politely?)
This issue is real, and it’s the thing that frustrates me most about alignment pipeline-building work in general right now. There are very likely some important formal/theoretical areas of alignment research that really do need to recruit mostly for something like ‘genius’. But a lot more of the active work that’s getting done (and way more of the hard-to-fill open jobs) depends much, much more on skills 1–5 here than on intelligence in that sense.
(This is on the margin. Here I’m focused on the actual population of people who tend to be interested in ML alignment research, so I’m baking in the assumption that all of the candidates could, say, get above-average grades in a STEM undergrad degree at a top-100 university if they tried.)
As someone who’s supervised/trained ML researchers for ~8 years now, I’d pretty much always rather hire someone who’s 90th-percentile on two or three of these skills than someone who’s no better than 70th percentile but has world-class IMO (or IOI) performance or a verified IQ of 160 or some other classic raw intelligence signal.
A good chunk of the general skills, at least when summarized like this:
seem like things that I would learn in a PhD program (granted, some of them seem like things you would need to figure out for yourself, where the advisor can’t help a ton). I’m not sure a PhD is the most efficient possible way to learn these things, but at least it has a blueprint I can follow, where I will probably end up at where I want to be.
Since you have a first-hand perspective on this, would you say I’m off the mark here?
As Sam says, PhDs are notoriously hard on mental health, and I think this is very not conducive to learning for most people.
For example, as someone who was a PhD student, I think I learned how to do these things:
only in the few months after leaving my PhD, though a lot of the learning was based on experiences in my PhD.
I mostly agree, but it’s messy. I don’t think it’s obvious that a PhD is anywhere near the ideal way to pick up some of these skills, or that earning a PhD definitely means that you’ve picked them up, but PhD programs do include lots of nudges in these directions, and PhD-holders are going to be much stronger than average at most of this.
In particular, like Johannes said, doing a PhD is notoriously hard on mental health for a number of reasons, even at a more-supportive-than-average lab. So to the extent that they teach ‘taking care of your mental health’ and ‘staying motivated when you’re lost’, it’s often by throwing you into stressful, confusing work situations without great resources and giving you the degree if you figure out how to navigate them.
I have not done a PhD. But my two cents here are that none of these skills seem very teachable by traditional teaching methods. I would be surprised if people try to teach more than half of these things explicitly in a PhD. And I don’t expect that they will teach them very well. I expect that you will need to figure out most of these things yourself. I have heard that most PhD students get depressed. That doesn’t sound like they have good models of how the mind works and how to take care of their mental health. Though all of it depends on how good the people around you are, of course.
Thanks for writing this, I generally agree with most of the points. Detailed commentary below:
It also consistently surprises me how little people in the SERI MATS cohorts read. E.g. I used to read 1-2 papers a day in decent detail as a grad student, and even now still read 3-5 a week. (I probably spent 1+ hour a day reading on average.) Would recommend doing a lot more of it, especially in adjacent fields.
In addition to gaining more knowledge, reading widely also helps a lot with research communication skills.
I don’t think I’m the mentor listed, but I have read everything on all three of Paul’s blogs (ai alignment, sideways view, and rational altruist) and did find it pretty valuable.
That being said, I wouldn’t recommend reading ~all of the three blogs. I think there are quite strong diminishing marginal returns after the first one or two dozen posts. My guess is reading more of the academic coursework or mainline results for areas people are interested in is far more valuable at that point.
I agree that these are pretty malleable. For example, ~1 year ago, I was probably two standard deviations less relentless and motivated in research topics, and probably a standard deviation lower on hustle/resourcefulness.
(That being said, if you asked me a year ago what my main problem is, I’d probably have said low motivation/executive function, and not “I’m not smart enough”.)
Interesting! Would be very curious to hear if there were specific things you think caused the change.
I’m pretty sure I’m the person being quoted here, and I was only referring to https://ai-alignment.com/.
If I read 1-2 papers in a day in detail, I wouldn’t do much else. I guess people get better at this to some extent. I’m wondering if this is something I just need to carry on doing and eventually I’ll get better at it or there are some other ways to make this process more efficient.
This post did something strange to my mind.
I already thought that thinking about the problem yourself is important, and basically required if you want to become a good researcher. At least that was the result of some explicit reasoning steps.
However, I then talked to a person about this, and they told me basically the opposite: that it is not required to think about the problems yourself, that it is okay to delegate this thinking to other people, and that these people probably know better what to do than you, as they have already thought about it for a long time.
This did not sit well with me. However, I now realize that some part of my brain was convinced to some degree by this argument. Now my brain feels validated for thinking the (likely) correct thing, and in some sense I believe again that thinking about the problem yourself is a really important part of becoming a researcher. That makes it easy to see that I did not do it enough.
I guess I should be a lot more careful about what other people put in my brain.
One other quality I’d add: Emotional stability, even in the face of things not adding up to normality.
One big thing that all alignment researchers (including junior researchers) should have is the ability to emotionally accept that sometimes things don’t add up to normality. In a field like alignment, they need to be able to accept, at least in theory, that the world may be doomed, without letting that break them or letting their emotions make things worse.