Hiring engineers and researchers to help align GPT-3
My team at OpenAI, which works on aligning GPT-3, is hiring ML engineers and researchers. Apply here for the ML engineer role and here for the ML researcher role.
GPT-3 is similar enough to “prosaic” AGI that we can work on key alignment problems without relying on conjecture or speculative analogies. And because GPT-3 is already being deployed in the OpenAI API, its misalignment matters to OpenAI’s bottom line — it would be much better if we had an API that was trying to help the user instead of trying to predict the next word of text from the internet.
I think this puts our team in a great place to have an impact:
If our research succeeds I think it will directly reduce existential risk from AI. This is not meant to be a warm-up problem; I think it’s the real thing.
We are working with state of the art systems that could pose an existential risk if scaled up, and our team’s success actually matters to the people deploying those systems.
We are working on the whole pipeline from “interesting idea” to “production-ready system,” building critical skills and getting empirical feedback on whether our ideas actually work.
We have the real-world problems to motivate alignment research, the financial support to hire more people, and a research vision to execute on. We are bottlenecked on finding excellent researchers and engineers who are excited to work on alignment.
What the team does
In the past Reflection focused on fine-tuning GPT-3 using a reward function learned from human feedback. Our most recent results are here, and had the unusual virtue of simultaneously being exciting enough to ML researchers to be accepted at NeurIPS while being described by Eliezer as “directly, straight-up relevant to real alignment problems.”
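As a rough illustration of what "a reward function learned from human feedback" can mean, here is a minimal sketch of one common formulation: a labeler compares two samples, and the reward model is penalized when it scores the preferred sample lower than the other. The post does not specify the loss, so the logistic (Bradley-Terry style) form and the name `preference_loss` are assumptions for illustration, not a description of the team's code.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred sample wins.

    Assumes a logistic model: P(preferred wins) = sigmoid(r_pref - r_rej).
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the reward model agrees with the labeler,
# and large when it ranks the pair the wrong way round.
agrees = preference_loss(2.0, 0.0)
disagrees = preference_loss(0.0, 2.0)
```

Under this assumption, minimizing the loss over many comparisons pushes the learned reward to rank samples the way labelers do, and the language model is then fine-tuned against that reward.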
We’re currently working on three things:
[20%] Applying basic alignment approaches to the API, aiming to close the gap between theory and practice.
[60%] Extending existing approaches to tasks that are too hard for humans to evaluate; in particular, we are training models that summarize more text than human trainers have time to read. Our approach is to use weaker ML systems operating over shorter contexts to help oversee stronger ones over longer contexts. This is conceptually straightforward but still poses significant engineering and ML challenges.
[20%] Conceptual research on domains that no one knows how to oversee and empirical work on debates between humans (see our 2019 writeup). I think the biggest open problem is figuring out how and if human overseers can leverage “knowledge” the model acquired during training (see an example here).
If successful, ideas will eventually move up this list, from the conceptual stage to ML prototypes to real deployments. We’re viewing this as practice for integrating alignment into transformative AI deployed by OpenAI or another organization.
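To make the second item more concrete, here is a toy sketch of one way a long summarization task could be broken into pieces that each fit a short context, so that every individual step stays small enough for a person to check. This is an assumed decomposition for illustration only; `summarize_short`, `max_words`, and the chunking scheme are hypothetical and are not taken from the team's system.

```python
from typing import Callable, List


def chunk(words: List[str], size: int) -> List[List[str]]:
    """Split a word list into consecutive pieces of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def summarize_recursively(
    text: str,
    summarize_short: Callable[[str], str],  # hypothetical short-context model
    max_words: int,
) -> str:
    """Summarize long text using only calls that see at most `max_words` words."""
    words = text.split()
    if len(words) <= max_words:
        return summarize_short(text)
    partial = [summarize_short(" ".join(piece)) for piece in chunk(words, max_words)]
    combined = " ".join(partial)
    if len(combined.split()) >= len(words):
        # Guard against a summarizer that never shortens its input.
        raise ValueError("summarize_short must shorten its input")
    return summarize_recursively(combined, summarize_short, max_words)


# Toy stand-in for a model: keep the first three words of the input.
def first_three_words(text: str) -> str:
    return " ".join(text.split()[:3])


long_text = " ".join(f"w{i}" for i in range(100))
summary = summarize_recursively(long_text, first_three_words, max_words=10)
```

The point of the sketch is the shape of the computation: no single call reads more than `max_words` words, even though the final summary depends on the whole text.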
What you’d do
Most people on the team do a subset of these core tasks:
Design+build+maintain code for experimenting with novel training strategies for large language models. This infrastructure needs to support a diversity of experimental changes that are hard to anticipate in advance, work as a solid base to build on for 6-12 months, and handle the complexity of working with large language models. Most of our code is maintained by 1-3 people and consumed by 2-4 people (all on the team).
Oversee ML training. Evaluate how well models are learning, figure out why they are learning badly, and identify+prioritize+implement changes to make them learn better. Tune hyperparameters and manage computing resources. Process datasets for machine consumption; understand datasets and how they affect the model’s behavior.
Design and conduct experiments to answer questions about our models or our training strategies.
Design+build+maintain code for delegating work to ~70 people who provide input to training. We automate workflows like sampling text from books, getting multiple workers’ answers to questions about that text, running a language model on those answers, then showing the results to someone else for evaluation. It also involves monitoring worker throughput and quality, automating decisions about what tasks to delegate to whom, and making it easy to add new work or change what people are working on.
Participate in high-level discussion about what the team should be working on, and help brainstorm and prioritize projects and approaches. Complicated projects seem to go more smoothly if everyone understands why they are doing what they are doing, is on the lookout for things that might slip through the cracks, is thinking about the big picture and helping prioritize, and cares about the success of the whole project.
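For a sense of the delegation workflow described above (sample text, collect several workers' answers, run a language model on those answers, show the result to someone else), here is a minimal sketch. All names here (`Task`, `run_round`, the worker, model, and evaluator callables) are hypothetical stand-ins; real tooling would also need queues, worker assignment, and quality monitoring.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Task:
    passage: str
    question: str
    answers: List[str]
    model_output: str


def run_round(
    passages: Sequence[str],
    question: str,
    workers: Sequence[Callable[[str, str], str]],
    model: Callable[[List[str]], str],
    evaluator: Callable[[Task], int],
    rng: random.Random,
) -> Tuple[Task, int]:
    """Run one round: sample a passage, gather answers, run the model, evaluate."""
    passage = rng.choice(passages)
    answers = [worker(passage, question) for worker in workers]
    model_output = model(answers)
    task = Task(passage, question, answers, model_output)
    return task, evaluator(task)


# Toy stand-ins for human workers, a language model, and a human evaluator.
toy_workers = [lambda p, q: p.upper(), lambda p, q: p.lower()]
task, rating = run_round(
    passages=["Call me Ishmael."],
    question="Restate the passage.",
    workers=toy_workers,
    model=lambda answers: " | ".join(answers),
    evaluator=lambda t: len(t.answers),
    rng=random.Random(0),
)
```

Passing each stage in as a callable keeps the round logic unchanged when the kind of work changes, which is one simple way to make it easy to add new tasks.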
If you are excited about this work, apply here for the ML engineer role and here for the ML researcher role.
Does it make sense to apply if I’m Russian? What do you think is the chance of Trump allowing H1B visas next year? Will you even consider foreign applicants? Do you provide green cards?
Let’s not get ahead of ourselves, friend.
It is at least the case that OpenAI has sponsored H1Bs before: https://www.myvisajobs.com/Visa-Sponsor/Openai/1304955.htm
Link to thread.
Worth saying that Eliezer still thinks our team is pretty doomed and this is definitely not a general endorsement of our agenda. I feel excited about our approach and think it may yet work, but I believe Eliezer’s position is that we’re just shuffling around the most important difficulties into the part of the plan that’s vague and speculative.
I think it’s fair to say that Reflection is on the Pareto frontier of {plays ball with MIRI-style concerns, does mainstream ML research}. I’m excited for a future where either we convince MIRI that aligning prosaic AI is plausible, or MIRI convinces us that it isn’t.
how suitable is the research engineering job for people with no background in ml, but who are otherwise strong engineers and mathematicians?
will these jobs be long-term remote? if not, on what timeframe will they be remote?
We expect to be requiring people to work from the office again sometime next year.
ML background is very helpful. Strong engineers who are interested in learning about ML are also welcome to apply, though no promises about how well we’ll handle those applications in the current round.
What is the expected time frame of the openings?
I am personally indisposed until ~end of October and may not be ready to start a new job for a little while after that, but would otherwise be very excited for such a role.
Somewhat related, do you have an idea of how many openings there will be? Like, fewer than 3 or more than 20, for example?
The team is currently 7 people and we are hiring 1-2 additional people over the coming months.
I am optimistic that our team and other similar efforts will be hiring more people in the future and continuously scaling up, and that over the long term there could be a lot of people working on these issues.
(The post is definitely written with that in mind and the hope that enthusiasm will translate into more than just hires in the current round. Growth will also depend on how strong the pool of candidates is.)
What’s the quickest way to get up to speed and learn the relevant skills, to the level expected of someone working at OpenAI?
Relevant 80K podcast: https://80000hours.org/podcast/episodes/olsson-and-ziegler-ml-engineering-and-safety/
“I’m from OpenAI, and I’m here to help you”.
Seriously, it’s not obvious that you’re going to do anything but make things worse by trying to make the thing “try to help”. I don’t even see how you could define or encode anything meaningfully related to “helping” at this stage anyway.
As for the bottom line, I can imagine myself buying access to the best possible text predictor, but I can’t imagine myself buying access to something that had been muddied with whatever idea of “helpfulness” you might have. I just don’t want you or your code making that sort of decision for me, thanks.
(Upvoted, because jbash is a good commenter and it’s a pretty reasonable question for someone unacquainted with Paul’s work.)
Hey jbash. So, while you’re quite right in the short term that in general the ‘helpful’ bots we build are irritating and inflexible (e.g. Microsoft’s Clippy), the main point of a lot of Paul’s AI research is to figure out how to define helpfulness in such a way that an ML system can successfully be trained to do it – the hard problem of defining ‘helpfulness’, not the short term version of “did a couple of users say it was helpful and did the boss say ship it”. He’s written about it in this post, and given a big-picture motivation for it here.
It’s abstract and philosophically hard and it quite plausibly will just not work out, but I do think Paul is explicitly attempting to solve the hard version of the problem with the full knowledge of what you said.
I think that “imitate a human who is trying to be helpful” is better than “imitate a human who is writing an article on the internet,” even though it’s hard to define “helpful.” I agree that’s not completely obvious for a bunch of reasons.
(GPT-3 is better if your goal is in fact to predict text that people write on the internet, but that’s a minority of API applications.)