AI Safety Needs Great Engineers
Top line: If you think you could write a substantial pull request for a major machine learning library, then major AI safety labs want to interview you today.
I work for Anthropic, an industrial AI research lab focussed on safety. We are bottlenecked on aligned engineering talent. Specifically, engineering talent. While we’d always like more ops folk and more researchers, our safety work is limited by a shortage of great engineers.
I’ve spoken to several other AI safety research organisations who feel the same.
Why engineers?
In May of last year, OpenAI released GPT-3, a system that did surprisingly well at a surprisingly broad range of tasks. While limited in many important ways, it made a lot of AI safety folk sit up and take notice. Systems like GPT-3 might not themselves be the existential threat that many of us are worried about, but it’s plausible that some of the issues that will be found in those future systems are already present in GPT-3, and that solving them in GPT-3 will help us solve the equivalent issues in the future systems we are worried about.
As such, AI safety has suddenly developed an empirical subfield. While before we could only make predictions about what might go wrong and how we might fix those things, now we can actually run experiments! Experiments are not and should never be the entirety of the field, but it’s a new and promising direction that leverages a different skill set to more ‘classic’ AI safety.
In particular, the different skill set it leverages is engineering. Running experiments on a real—if weak—AI system requires a substantial stack of custom software, with projects running from hundreds of thousands to millions of lines of code. Dealing with these projects is not a skillset that many folks in AI safety had invested in prior to the last 18 months, and it shows in our recruitment.
What kind of engineers?
Looking at the engineers at Anthropic right now, every one of them was a great software engineer prior to joining AI safety. Every one of them is also easy to get on with. Beyond that, common traits are:
experience with distributed systems
experience with numerical systems
caring about, and thinking a lot about, AI safety
comfortable reading contemporary ML research papers
expertise in security, infrastructure, data, numerics, social science, or one of a dozen other hard-to-find specialities.
This is not a requirements list though. Based on the people working here already, ‘great software engineer’ and ‘easy to get on with’ are hard requirements, but the things in the list above are very much nice-to-haves, with several folks having just one or none of them.
Right now our job listings are bucketed into ‘security engineer’, ‘infrastructure engineer’, ‘research engineer’ and the like because these are the noun phrases that a lot of the people we like identify themselves with. But what we’re actually most concerned about are generally-great software engineers who—ideally—have some extra bit of deep experience that we lack.
How does engineering compare to research?
At Anthropic there is no hard distinction between researchers and engineers. Some other organisations retain the distinction, but the increasing reliance of research on substantial, custom infrastructure is dissolving the boundary at every industrial lab I’m familiar with.
This might be hard to believe. I think the archetypal research-and-engineering organisation is one where the researchers come up with the fun prototypes, then toss them over the wall for the engineers to clean up and implement. I think the archetype is common enough that it dissuades a lot of engineers from applying for engineering roles; they apply instead for research positions, where they underperform because they’re evaluated on a set of metrics different from the ones they’re best at.
What’s changed in modern AI safety is that the prototypes now require serious engineering, and so prototyping and experimenting is now an engineering problem from the get-go. A thousand-line nested for-loop does not carry research as far as it once did.
I think this might be a hard sell to folks who have endured those older kinds of research organisations, so here are some anecdotes:
The first two authors on GPT-3 are both engineers.
Some of the most pure engineers at Anthropic spend weeks staring at learning curves and experimenting with architectural variants.
One of the most pure researchers at Anthropic has spent a week rewriting an RPC protocol.
The most excited I’ve ever seen Anthropic folk for a new hire was for an engineer who builds academic clusters as a hobby.
Should I apply?
It’s hard to judge sight-unseen whether a specific person would suit AI safety engineering, but a good litmus test is the one given at the top of this post:
With a few weeks’ work, could you (hypothetically!) write a new feature or fix a serious bug in a major ML library?
Are you already there? Could you get there with a month or two of effort?
I like this as a litmus test because it’s very close to what my colleagues and I do all day. If you’re a strong enough engineer to make a successful pull request to PyTorch, you’re likely a strong enough engineer to make a successful pull request to our internal repos.
Actually, the litmus test above is only one half of the full litmus test I give folk that I meet out and about. The other half is
Tell me your thoughts on AI and the future.
with a pass being a nuanced, well-thought-out response.
Should I skill up?
This post is aimed at folks who already can pass the litmus test. I originally intended to pair it with another post on skilling up to the point of being able to pass the test, but that has turned out to be a much more difficult topic than I expected. For now, I’d recommend starting with 80k’s software engineering guide.
Take homes
We want more great engineers.
If you could write a pull request for a major ML library, you should apply to one of the groups working on empirical AI safety: Anthropic, DeepMind Safety, OpenAI Safety, Apollo, and Redwood Research.
If that’s not you but you know one or more great engineers, ask them if they could write a pull request for a major ML library. If yes, tell them to apply to the above groups.
If that’s not you but you’d like it to be, watch this space—we’re working on skilling up advice.
That 80k guide seems aimed at people who don’t yet have any software engineering experience. I’m curious what you think the path is from “Average software engineer with 5+ years experience” to the kind of engineer you’re looking for, since that’s the point I’m starting from.
I’m in a similar place, and had the exact same thought when I looked at the 80k guide.
Andy may have meant to link to this article instead, which also has this podcast companion.
Can someone briefly describe what empirical AI safety work Cohere is doing? I hadn’t heard of them until this post.
This comment reflects my views and not those of my employer (Cohere).
We are massively growing our safety team on both the engineering and product sides, and one of our major bottlenecks is exactly this kind of technical talent. We are heavily focused on making our production models as safe as possible, both during training and in deployment. One of the biggest projects to this end is the safety harness project, which should have more information coming out soon: https://docs.cohere.ai/safety-harness/. We are especially focused on worst-case scenarios, since anyone can start using our models relatively quickly. Here are 2 of the papers the safety team has worked on in the past; we have much more on the way.
Mitigating harm in language models with conditional-likelihood filtration
No News is Good News: A Critique of the One Billion Word Benchmark
I am also interested in this.
Given the discussion around OpenAI plausibly increasing overall AI risk, why should we believe that this work will result in a net risk reduction?
I don’t like your framing of this as “plausible” but I don’t want to argue that point.
Afaict it boils down to whether you believe in (parts of) their mission, e.g. interpretability of large models and how much that weighs against the marginal increase in race dynamics if any.
That sounds like you are in denial. I didn’t make a statement about whether or not OpenAI raises AI risk; I referred to the discussion about whether or not it does. That discussion exists, and people like Eliezer argue that OpenAI results in a net risk increase. Being in denial about that discourse is bad: it can help with feeling good while working in the area, but it prevents good analysis of the dynamics.
No, I take specific issue with the term ‘plausibly’; I don’t have a problem with the term ‘possibly’. Using ‘plausibly’ already presumes judgement about the outcome of the discussion, which I did not want to get into (mostly because I don’t have a strong view on this yet). You could of course argue that that’s false balance, and if so I would like to hear your argument (though maybe not under this particular post, if people think it’s too off-topic).
ETA: if this is just a disagreement about our definitions of the term ‘plausibly’ then nevermind, but your original comment reads to me like you’re taking a side.
Eliezer wrote:
To me it seems reasonable to see that as EY presuming judgement about the effects of OpenAI.
Oh yes, I’m aware that he expressed this view. That’s different, however, from it being objectively plausible (whatever that means). I have the feeling we’re talking past each other a bit. I’m not saying “no-one reputable thinks OpenAI is net-negative for the world”; I’m just pointing out that it’s not as clear-cut as your initial comment made it seem to me.
FWIW, “plausible” sounds to me basically the same as “possibly”. So my guess is this is indeed a linguistic thing.
I’m an engineer, but the positions seem to tend to require living in specific locations, so I cannot apply.
I’m going to take this blog post as the explanation for the rejection I got from Anthropic five mins ago for the researcher position.
As a self-taught programmer who’s dabbled in ML but has only done front- and back-end web work: it’s been pretty frustrating trying to find a way to work on ML or AI safety for the last four years. I think some of the very recent developments like RR’s ML boot camp are promising on this front, but I’m pretty surprised that Redwood was surprised they would get 500 applications. We’ve been telling people explicitly “this is an emergency” for years now, but tacitly “you can’t do anything about it unless you’re a 99th-percentile programmer who is also positioned in the right place at the right time to apply and live in the Bay Area.” Or at least, that’s how it’s felt to me.
I wonder if some subset of the people who weren’t accepted to the Redwood thing could organise a remote self-taught version. They note that “the curriculum emphasises collaborative problem solving and pair programming”, so I think that the supervision Redwood provides would be helpful but not crucial. Probably the biggest bottleneck here would be someone stepping up to organise it (assuming Redwood would be happy to share their curriculum for this version).
I agree that this would be helpful if Redwood shares their curriculum. If someone is willing to take the lead on organizing, I’d be happy to help out as much as I can (and I suspect this is true for a not-insignificant number of people who applied to the thing). I’d do it myself, but I expect not to have the free time to commit to it and do it right in the next few months.
Same here (not sure yet if I’ll get accepted to AISC, though), but I would be happy to help with or co-organize something like Richard_Ngo suggested (although I’ve never organized something like that before). Maybe a virtual version in (continental?) Europe, if there are enough people.
Maybe we could also send out an invitation to all the people who got rejected to join a Slack channel. (I could set that up if necessary; since I don’t have the emails, though, someone else would need to send the invitations.) There, based on the curriculum, people could form self-study groups with others close by (or remotely) and talk about difficulties, bugs, etc. Maybe even the people who weren’t rejected could join the Slack and help answer questions (if they like and have time, of course)?
I’ve created a Discord for people interested in organizing / collaborating / self-study: https://discord.gg/Ckj4BKUChr. People could start with the brief curriculum published in this document until a full curriculum becomes available :)
FYI That invite link has now expired!
Should work again :)
I’m curious what this is referring to—was there public communication to that effect?
From Redwood’s application update (rejecting those who didn’t make the cut):
Oh, I misread, I thought they would have been surprised to get 500 applicants for an open job position.
Sorry, but what is RR?
Redwood Research.
This might be a false alarm, but “tell me your thoughts on AI and the future” is an extremely counterproductive interview question. You’re presenting it as a litmus test for engineers to apply to themselves, and that’s fine as far as it goes. But if it’s typical or analogous to some other test(s) you use to actually judge incoming hires, it doesn’t bode well. By asking it you are, on some level, filtering for public speaking aptitude and ability to sound impressively thoughtful, two things which probably have little or nothing to do with the work you do.
I realize that might seem like a pedantic point, and you might be asking yourself: “how many smart people who want to work here can’t drop impressive speeches about X? We’ll just refrain from hiring that edge-case population.” The reason it matters that your interview “could” be selecting for the wrong thing is that recruitment is an adversarial process, not a random one. You are competing against other technology companies that have better and more scientific hiring pipelines, and more time and money to build them. Those companies often diligently reject the people who can speak well but not code. The result is that the candidates you’re looking at will almost always seem curiously good at answering these questions while under-performing on actual workplace tasks. Even if this were happening, I’m sure you’d believe everything was fine, because your VC money lets you pay enormous salaries that obscure the problem, and because AI safety companies get a glut of incoming attention from sites like LessWrong. All the more reason not to waste those things.
Worse, you have now published that question, so you will get a large number of people who rehearse their answers and practice them in front of a mirror in preparation for the interview. “Oh well, most people are honest, it’ll only be like 1/2/5/10/25% of our applicants that...” Again, not necessarily true of your passing applicants, and definitely not necessarily true of applicants rejected or less well compensated by your competitors.
I can reassure you that it is in fact a litmus test for engineers to apply to themselves, and that’s as far as it goes.
While part of me is keen to discuss our interview design further, I’m afraid you’ve done a great job of laying out some of the reasons not to!
Glad to hear that :)
Any chance that Anthropic might expand the team to remote international collaboration in the future? I would apply, but I am from Ukraine. Many great software companies have successfully switched to remote work, and the COVID crisis boosted this practice a lot. So, just wondering.
It’s not impossible, but it appears unlikely for the foreseeable future. We do sponsor visas, but if that doesn’t suit then I’d take a look at Cohere.ai, as they’re one org I know of with a safety team who are fully-onboard with remote.
Can you add something about whether Anthropic does or doesn’t allow remote work to the job listings? I’m inferring from the lack of any mention of remote work that in-person is strictly required, but I’m not sure if that’s what you’re intending.
In-person is required. We’ll add something to the job descriptions in the new year, thanks for the heads up!
I’m an experienced engineer and EA excited to work on these things, but I am only available part-time and remote because I am raising my kid, so I’m not applying right now.
If I knew of useful FOSS work that was directly applicable I might be spending time doing it.
EleutherAI has a whole project board dedicated to open-source ML, both replicating published papers and doing new research on safety and interpretability.
(opinions my own, etc)
Thanks, I was aware of Eleuther but I wasn’t previously aware how much they cared about alignment-related progress.
I have a background in software engineering but I would like to get into AI safety research.
A problem I have had is that I didn’t know whether I should pursue the research scientist or research engineer path, which seem to be quite different. Becoming a research engineer involves lots of work with ML code, whereas to become a research scientist you usually have to get a PhD and do some research.
I read in an older document that there was a bottleneck in talent for both research scientists and research engineers. However, according to your post this seems to have changed, and now there seems to be a greater shortage of research engineers than research scientists.
As a result, I am now leaning more in favor of becoming a research engineer. Another advantage is that the research engineer path seems to have a lower barrier to entry.
Would you be interested in a great engineer who is a skeptic about the alignment problem?
Yes! Though that engineer might not be interested in us.
Well the engineer might be trying to break into AGI research from a traditional software (web and gamedev) background, haha.