I was thinking of things like the Alignment Research Science role. If they talked up “this is a superalignment role”, I’d have an estimate higher than 55%.
We are seeking Researchers to help design and implement experiments for alignment research. Responsibilities may include:
Writing performant and clean code for ML training
Independently running and analyzing ML experiments to diagnose problems and understand which changes are real improvements
Writing clean non-ML code, for example when building interfaces to let workers interact with our models or pipelines for managing human data
Collaborating closely with a small team to balance the need for flexibility and iteration speed in research with the need for stability and reliability in a complex long-lived project
Understanding our high-level research roadmap to help plan and prioritize future experiments
Designing novel approaches for using LLMs in alignment research
Yeah, I think that this is disambiguated by the description of the team:
OpenAI’s Alignment Science research teams are working on technical approaches to ensure that AI systems reliably follow human intent even as their capabilities scale beyond human ability to directly supervise them.
We focus on researching alignment methods that scale and improve as AI capabilities grow. This is one component of several long-term alignment and safety research efforts at OpenAI, which we will provide more details about in the future.
So my guess is that you would call this an alignment role (except for the possibility that the team disappears because of superalignment-collapse-related drama).
Yeah, I read those lines, and also "Want to use your engineering skills to push the frontiers of what state-of-the-art language models can accomplish", and remain skeptical. I think OpenAI tends to equivocate on how they use the word "alignment" (or: they use it consistently, but not in a way that I consider obviously good. Like, I think the people working on RLHF a few years ago probably contributed to ChatGPT being released earlier, which I think was bad*).
*I like the part where the world feels like it’s actually starting to respond to AI now, but, I think that would have happened later, with more serial-time for various other research to solidify.
(I think this is a broader difference in guesses about what research/approaches are good, which I'm not actually very confident about, esp. compared to habryka, but it's where I'm currently coming from.)
*I like the part where the world feels like it’s actually starting to respond to AI now, but, I think that would have happened later, with more serial-time for various other research to solidify.
Tangent:
And with less serial-time for various policy plans to solidify and gain momentum.
If you think we're irreparably far behind on the technical research, and advocacy / political action is relatively more promising, you might prefer to trade years of timeline for earlier, more widespread awareness of the importance of AI, and a longer period of people pushing on policy plans.