Next steps after AGISF at UMich
I ran an AGI Safety Fundamentals (AGISF, technical alignment track) program at the University of Michigan last semester. At the end of the program, I shared a “next steps” document full of ideas for actions to take after finishing the program, and I’ve refined it into a public version that you can find here. Below are some relevant parts for people in a similar position.
Getting feedback
I encourage you to apply for advising at 80,000 Hours (80k), a career advising service for people who want to find career paths that best improve the world. 80k views risks from advanced AI as one of the world’s most pressing problems. The advisors aren’t professional AI researchers, but they know a lot about AI safety and can help you think through your plans.
One more option is career coaching from AI Safety Support.
Advice and guides
For general advice on technical alignment careers, including links to other career guides, see the Careers in alignment document (by Richard Ngo of OpenAI) from Week 7.
Career opportunities
The AGI Safety Fundamentals Opportunities Board is an excellent resource for tracking jobs, internships, and other programs.
Getting started with academic research
The Student ML Safety Research Stipend Opportunity provides stipends for students doing empirical ML safety research.
Personal projects and exercises
“All skill depends on practice. So too, with memory. If you want to remember something, you need to practice remembering it, not just looking at it.
Retrieval practice—where you shut the book and try to recall what you’ve learned without looking at it—is one of the most effective studying techniques. It beat fancier techniques like concept mapping in head-to-head trials. Retrieval also works even if you can’t get feedback.”
– Scott Young in The 10 Essential Strategies for Deeper Learning
See Technical AI safety exercises and projects for a list of possible projects and exercises to try. Also consider working on a project with a friend!
Upskilling
You might be able to build skills through traditional routes like taking classes. I took Deep Learning for Computer Vision with Justin Johnson and it was the best EECS course I have ever taken. I highly recommend taking this class as your intro to machine learning.
You might also be able to build skills outside of school (e.g. with online courses, internships, textbooks, research programs, and other projects). Sometimes this is actually a more efficient way to learn the advanced material you need. Here are some guides with suggestions:
Levelling Up in AI Safety Research Engineering [Public]
How to pursue a career in technical AI alignment
Further alignment resources from AGI Safety Fundamentals.
All these guides have some recommended online courses. I don’t think Andrew Ng’s deep learning course is included anywhere, but it’s another good option. Also consider Coursera’s new Mathematics for Machine Learning and Data Science Specialization.
You do not need to gain comprehensive background knowledge before you can start working on technical AI safety projects. One extreme plan is to learn everything before doing anything, and another extreme plan is to do something before learning anything (and learn the background knowledge you need as it comes up). You probably want to balance somewhere between these two extremes. Experiment to see what works for you!
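As a concrete illustration of the “start doing and learn as you go” end of that spectrum, here is a minimal first project sketch: training a small MNIST digit classifier in PyTorch. This example is my own and isn’t taken from the guides above; it assumes you have torch and torchvision installed. The point is that a project this small is enough to start practicing, and you can look up each piece (DataLoaders, optimizers, loss functions) as you need it.

```python
# Minimal sketch (illustrative example, not from the linked guides): train a
# small MNIST digit classifier in PyTorch. Assumes torch and torchvision.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download MNIST and wrap the training split in a DataLoader.
train_data = datasets.MNIST(root="data", train=True, download=True,
                            transform=transforms.ToTensor())
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)

# A tiny two-layer fully connected network is enough for a first exercise.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
                      nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Standard training loop: forward pass, loss, backward pass, parameter update.
for epoch in range(2):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: final batch loss {loss.item():.3f}")
```

Once something like this runs, the exercises linked under “Personal projects and exercises” above are a natural next step.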
Following research + news
The AGI Safety Fundamentals Newsletter and Opportunities Board.
The ML Safety Newsletter and ML Safety Twitter account can help you stay up-to-date with ML safety news.
The ML Safety subreddit and ML Safety Daily account provide an even more frequent digest.
If you want an even more regular digest than that, check out this link! It updates daily with new ML papers, although many of them won’t be directly related to AI safety.
Import AI is a weekly newsletter covering the significance of recent AI progress.
Apart Research posts regular brief YouTube videos with recent AI safety news.
LessWrong is “an online forum and community dedicated to improving human reasoning and decision-making,” and it has a heavy focus on discussing risks from advanced AI. Here are a few posts with interesting non-AI content. For AI content, see:
Common misconceptions about OpenAI
Chris Olah’s views on AGI safety
EfficientZero: How It Works
Counterarguments to the basic AI x-risk case
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
(My understanding of) What Everyone in Technical Alignment is Doing and Why
The next decades might be wild
An overview of 11 proposals for building safe advanced AI
Reframing Superintelligence: Comprehensive AI Services as General Intelligence
Refining the Sharp Left Turn threat model, part 1: claims and mechanisms
Frequent arguments about alignment
AGI Ruin: A List of Lethalities (see also DeepMind alignment team opinions on AGI ruin arguments and Where I agree and disagree with Eliezer)
7 traps that (we think) new alignment researchers often fall into
Notice that almost all the AI posts above are cross-posted to the AI Alignment Forum. Unlike on LessWrong, posting and commenting on the Alignment Forum “is limited to deeply established researchers in the field.”
The Alignment Newsletter is very good. It’s pretty inactive right now, but you can search through this database of past papers.
aisafety.community tracks many online groups you can join to connect with people from around the world who care about AI safety. Email me if any of the links are dead and I’ll get you a link that works.
aisafety.video has many channels and podcasts you can follow for more AI safety content.
AI safety organizations
In my view, humanity could use a lot more time to prepare for advanced AI. As such, I oppose most work that significantly reduces the time we have for solving problems, including some work at top AI labs. Even some jobs that contribute to AI safety can substantially reduce the time we have (and, in general, I think that any field can have some seemingly benign jobs that actually cause substantial harm), so it’s important to be cautious and think carefully about the impact of a given role. There are, of course, tradeoffs to consider: see the “Safety Capabilities Balance” section of “X-Risk Analysis for AI Research” and this anonymous advice from 80k.
Furthermore, some technical safety research is not very useful for AI safety, even if the researchers have good intentions. The organizations listed below are trying to reduce catastrophic risks from AI, but they are not necessarily accomplishing this. Again, it’s important to think critically about the effects and contributions of any role you’re considering. Don’t blindly apply to every job you see with “AI safety” in the description.
Here is a big (but not comprehensive) list of organizations doing AI safety work, especially technical research. Inclusion on this list does not imply my endorsement. See also aisafety.world for more, as well as aisafety.world/map (still a work in progress; to discuss additions and improvements you can join the Alignment Ecosystem Development Discord server).
Redwood Research
Founded in 2021 with the mission “to align superhuman AI.” Mostly empirical research. Also hosts the MLAB bootcamp and the REMIX program.
Machine Intelligence Research Institute (MIRI)
One of the earliest organizations to start worrying about the future of AI. Mostly theoretical research.
Conjecture
A venture capital-funded startup doing AI alignment research in London. Also hosts the Refine program.
Center for AI Safety (CAIS)
“Our mission is to reduce catastrophic and existential risks from artificial intelligence through technical research and advocacy of machine learning safety in the broader research community.” Also hosts the Intro to ML Safety program.
OpenAI and DeepMind are two of the leading groups for advancing AI capabilities, so I oppose some of the work they’re doing (see above). However, they also have people working on alignment/safety and on governance, and I do recommend those roles.
Anthropic “is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems.” They are a top research lab, but some of their projects (like the Claude chatbot) seem to be accelerating progress towards advanced AI, so again, the situation is complicated and this is not a blanket recommendation for all roles at Anthropic.
Alignment Research Center (ARC)
Fund for Alignment Research (FAR)
Aligned AI
Encultured AI
Center on Long-Term Risk (CLR)
Association For Long Term Existence And Resilience (ALTER)
Center for Human-Compatible AI (CHAI) at UC Berkeley
Future of Humanity Institute (FHI) at Oxford
Centre for the Study of Existential Risk (CSER) at Cambridge
NYU Alignment Research Group
Obelisk
Stanford Existential Risks Initiative (SERI)
Runs the ML Alignment Theory Scholars Program (MATS).
Center for Security and Emerging Technology (CSET)
Center for the Governance of AI (GovAI)
AI Impacts
Epoch AI
Apart Research
Runs several projects, including hackathons.
Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
Summer research fellowship.
AI Safety Camp
The effective altruism (EA) movement cares a lot about AI safety, so EA Global conferences from the Centre for Effective Altruism are great places to meet current and aspiring AI safety researchers.
Open Philanthropy
Ought